SYNTHETIC FATTY ACID DESATURASE GENE
FOR EXPRESSION IN PLANTS

This application claims priority to U.S. 
Provisional Application No. 60/097,586, filed August 24, 
1998, the entirety of which is incorporated by reference 
herein.

FIELD OF THE INVENTION 

This invention relates to the field of genetic 
engineering, and more particularly to transformation of 
plants with heterologous fatty acid desaturase genes 
modified for optimum expression in plants. 

BACKGROUND OF THE INVENTION 

Several publications are referenced in this 
application in order to more fully describe the state of 
the art to which this invention pertains. The disclosure
of each of these publications is incorporated by 
reference herein. 

Alteration of fatty acid desaturation in plants 
is of interest to plant biologists and food scientists 
alike, due to the influence of unsaturated fatty acids on
the health benefits and flavors of foods, as well as the 
role of these molecules in plant biological processes. 
For a nation interested in healthy diet, the quality of 
fats and oils depends on their fatty acid composition, 
with oils high in monounsaturated fatty acids (e.g.,
canola, olive) gaining popularity as new health benefits
are discovered. Considering the flavors of plant foods, 
many flavor-producing compounds are derived from 
peroxidation of unsaturated fatty acids. Thus, efforts
are being made to produce plants with increased amounts
of unsaturated fatty acids, preferably monounsaturated
fatty acids.

WO 00/11012
PCT/US99/19443

In animal and fungal cells, monounsaturated
fatty acids are aerobically synthesized from saturated
fatty acids by a microsomal Δ-9 fatty acid desaturase
that is membrane bound and cytochrome b5-dependent. A
double bond is inserted between the 9- and 10-carbons of
palmitoyl (16:0) and stearoyl (18:0) CoA to form
palmitoleic (16:1) and oleic (18:1) acids. In the
reaction mechanism, electrons are transferred from NADH-
dependent cytochrome b5 reductase, via the heme-containing
cytochrome b5 (Cyt b5) molecule, to the Δ-9 fatty acid
desaturase. The major form of cytochrome b5 in animal,
fungal and plant cells exists as an independent protein
molecule that is anchored to the membrane by a short,
carboxyl terminal, hydrophobic stretch of amino acids.
The carboxyl terminal anchor orients the heme group of
the Cyt b5 on the membrane surface and allows it to
translationally diffuse across the surface of the
membrane. This property of lateral mobility allows this
form of cytochrome b5 to participate as an electron donor
to a number of different proteins that catalyze a variety
of metabolic reactions on the membrane surface, including
fatty acid desaturases, various sterol biosynthetic
enzymes and a variety of cytochrome P450 mediated
reactions. While this contributes to the versatility of
Cyt b5 as an electron donor, it also implies that the
major form of cytochrome b5 shuttles between its redox
partners by translational diffusion across the surface of
the membrane (Strittmatter and Rogers, Proc. Natl. Acad.
Sci. USA 72: 2658-2661, 1975; Lederer, Biochimie 76:
674-692, 1994). Furthermore, this mechanism suggests
that an independent, membrane bound cytochrome b5 molecule
can potentially limit the rate of the metabolic reaction,
depending on its abundance, its location on the membrane
surface, its proximity to the electron acceptor, and the
rate at which it can move and orient itself to the
acceptor on the membrane surface.

In plants, unsaturated fatty acids are formed
and incorporated into complex lipids in two distinct
cellular compartments. De novo fatty acid synthesis
occurs almost exclusively in the plastids, producing the
saturated species 16:0-ACP (acyl carrier protein) and
18:0-ACP. 18:1-ACP is formed from 18:0-ACP in the
plastid by a soluble, ferredoxin-dependent Δ-9
desaturase. These fatty acids are then shunted into one
of two routes - a plastid-localized "procaryotic" pathway
or a cytosolic/ER (endoplasmic reticulum) "eucaryotic"
pathway - for further modification and acylation into
glycerolipids (Somerville and Browse, Science 252: 80-87,
1991). The acyl-ACPs that are shunted into the
prokaryotic pathway remain within the plastid and are
used for the synthesis of phosphatidic acid and further
conversion to chloroplast glycerolipids. The fatty acyl
groups of those lipids may be further desaturated by
plastid desaturases that also use ferredoxin as the
electron donor.

Acyl-ACPs that are shunted into the eukaryotic
pathway are converted to free fatty acids and transported
across the chloroplast membrane into the cytoplasm, where
they are converted to acyl CoA thioesters by acyl CoA
synthetase. Those fatty acids are then converted to
cytoplasmic/ER phosphatidic acid, which can then be
converted to membrane glycerophospholipids, or to storage
lipids in the form of triacylglycerols and sterol esters
that are the major components of plant oils.

Most polyunsaturated 18-carbon plant fatty
acids appear to be formed in the cytosol by the ER-bound
desaturases (Table 1). Once the 18:1 fatty acid is
incorporated into phospholipid, an ER-bound desaturase
can catalyze the formation of a Δ-12 double bond in the
fatty acyl chain to form Δ-9,12 18:2. Other ER-bound
desaturase enzymes can act on 18:2 to introduce a Δ-15
double bond to form Δ-9,12,15 18:3. These desaturases are
thought to be similar to animal and fungal desaturases
because they are membrane bound and appear to require a
cytochrome b5-mediated electron transport chain.



TABLE 1:

Plant             | Gene   | Desaturase Type                                          | Primary Activity             | b5 chimera      | Reference
------------------+--------+----------------------------------------------------------+------------------------------+-----------------+----------------------------------------------------------
Arabidopsis       | FAD2   | Δ12, microsomal                                          | 18:1->18:2                   | no              | Okuley J. et al. Plant Cell 6: 147-158, 1994
Arabidopsis       | FAD3   | Δ15, microsomal                                          | 18:2->18:3                   | no              | Shah S. & Z. Xin, Plant Physiol. 114: 1533-1539, 1997
Nicotiana tabacum | NtFAD3 | Δ15, microsomal                                          | 18:2->18:3                   | no              | Hamada T. et al. Plant & Cell Physiol. 37: 606-611, 1996; Hamada T. et al. Transgenic Res. 5: 115-121, 1996
Soybean           | FAD2-1 | Δ12, microsomal, developing seeds                        | 18:1->18:2                   | no              | Heppard E.P. et al. Plant Physiol. 110: 311-319, 1996
Soybean           | FAD2-2 | Δ12, microsomal, developing seeds and vegetative tissues | 18:1->18:2                   | no              | Heppard E.P. et al. 1996, supra
Borage            |        | Δ-6                                                      | 18:2 (9,12)->18:3 (6,9,12)   | yes, N-terminal | Sayanova et al. Proc. Natl. Acad. Sci. USA 94: 4211-4216, 1997



The conversion of saturated fatty acyl chains
to monounsaturated species in plants appears to be
confined to the chloroplasts. No Δ-9 desaturase activity
has been identified in the cytoplasm or endoplasmic
reticulum of plants. The soluble plant chloroplast Δ-9
desaturase is highly specific for 18:0-ACP as a substrate
and does not desaturate 16:0-ACP (Somerville and Browse,
supra). As a result, only a small amount of 16:1 is
present in most higher plants, while the pool of 16:0 is
concomitantly larger due to its disfavor as a substrate
for the plant desaturase. By comparison, a larger amount
of 18:1 is found in higher plant cells, with a
correspondingly lesser amount of 18:0. Thus, for the
purpose of increasing the concentration of mono-
unsaturated lipids in a plant, the 16:0 fatty acid
constitutes a significant pool of available substrate
that is under-utilized by the endogenous plant
desaturase.

In contrast to the plant Δ-9 desaturase, fungal
and animal Δ-9 desaturases efficiently convert a wide
range of saturated fatty acids with differing hydrocarbon
chain lengths to monounsaturated fatty acids. The
Saccharomyces cerevisiae enzyme, for example, efficiently
desaturates even and odd chain fatty acyl CoA substrates
from 13 carbons to 19 carbons in length. A broad
functional homology exists among various Cyt b5-dependent
desaturases, as evidenced, for example, by the successful
expression of the rat Δ-9 desaturase in yeast (Stukey et
al., J. Biol. Chem. 265: 20144-20149, 1990).

The rat and yeast Δ-9 desaturase genes have
been expressed in plants: both the rat and the yeast
genes have been expressed in tobacco (Grayburn et al.,
BioTechnology 10: 675-678, 1992 (rat); Polashock et al.,
Plant Physiol. 100: 894-901, 1992 (yeast)), and the yeast
gene has also been expressed in tomato (Wang et al., J.
Agric. Food Chem. 44: 3399-3402, 1996). The yeast Δ-9
desaturase has been shown to function in tobacco and
tomato, leading to increases in the level of
monounsaturated fatty acids (both 16:1 and 18:1) and
other compounds derived from monounsaturated fatty acids
(e.g., polyunsaturated fatty acids, hexanal, 1-hexanol,
heptanal, trans-2-octenal) (Polashock et al., supra; Wang
et al., supra). Expression of the rat desaturase also led
to an increase in monounsaturated 16- and 18-carbon fatty
acids (Grayburn et al., supra).

From the foregoing, it can be seen that
transgenic plants expressing animal or fungal Δ-9
desaturase genes can be improved in their unsaturated
fatty acid composition by virtue of the activity of the
foreign enzyme. Of further advantage, it has recently
been discovered that some fungal Δ-9 desaturases (e.g.,
Saccharomyces cerevisiae) are fusion proteins comprising
an intrinsic Cyt b5 domain (Mitchell & Martin, J. Biol.
Chem. 270: 29766-29772, 1995). When this gene is
expressed, sufficient Cyt b5 is produced to drive the
desaturase reaction at an optimum level and is not
dependent on existing plant Cyt b5. The known animal Δ-9
desaturases do not contain this fused Cyt b5 motif and
must rely on independently-produced Cyt b5 to provide the
electrons for the reactions.

Though fungal or animal Δ-9 desaturases (e.g.
the S. cerevisiae desaturase or the animal desaturases)
may be expressed and functional in certain plants, their
expression is likely less than optimal in plants, and
expression may not even be possible in other plant
species, due to several factors, including differences in
codon usage and codon preference in plants as compared to
fungi and among different plant species, and the presence
of cryptic intron splicing signals, among others. All of
these factors can lead to poor expression, or no
expression, of a non-plant foreign gene in a plant cell.

Accordingly, in order to make use of non-plant
fatty acid desaturases, particularly those such as the S.
cerevisiae Δ-9 desaturase comprising an internal Cyt b5
motif, a need exists to design modified desaturase-
encoding DNA molecules that are customized for expression
in plant cells and specific plant tissues. It would be
of even greater advantage to optimize such modified DNA
molecules for expression in particular plant species,
such as those that are grown and harvested primarily for
oils.

SUMMARY OF THE INVENTION 

According to one aspect of the invention, a
synthetic fatty acid desaturase gene for expression in a
multi-cellular plant is provided, the gene comprising a
desaturase domain and a Cyt b5 domain, wherein the gene is
customized for expression in a plant cytoplasm. In one
embodiment, the synthetic gene is customized for
expression in a monocotyledonous plant. In another
embodiment, the synthetic gene is customized for
expression in a dicotyledonous plant. In a preferred
embodiment, the synthetic gene is customized for
expression in a plant genus selected from the group
consisting of Arabidopsis, Brassica, Phaseolus, Oryza,
Olea, Elaeis (Oil Palm) and Zea.

In a preferred embodiment of the invention, the
desaturase is a cytosolic Δ-9 desaturase. The
Saccharomyces cerevisiae Δ-9 desaturase is particularly
preferred.

In another embodiment of the invention, the
synthetic gene is customized from a naturally occurring
gene comprising both a desaturase domain and a Cyt b5
domain. Alternatively, the synthetic gene is a chimeric
gene comprising a desaturase domain and a heterologous
Cyt b5 domain.

In another embodiment, the synthetic gene is
customized from a naturally occurring gene such that the
synthetic gene and the naturally occurring gene encode an
identical amino acid sequence. Alternatively, the
synthetic gene is customized from a naturally occurring
gene such that the synthetic gene and the naturally
occurring gene encode a similar and functionally
conserved amino acid sequence.

In another embodiment, a naturally occurring or
a synthetic gene is customized so that specific amino
acid modifications are made to enhance the function of the
encoded protein. Examples of such modifications include
changing amino acids that are subjected to
phosphorylation or other post-translational modifications
that may alter or regulate the activity of the Δ-9
desaturase enzyme.

In another embodiment of the invention,
elements of a naturally occurring or a synthetic
desaturase gene that are not essential for enzymatic
function are replaced or linked with elements derived
from plant ER lipid biosynthetic genes that are normally
expressed in maturing seeds or other plant tissues. The
improved expression of the modified gene produced by the
inclusion or substitution of plant DNA sequences in the
synthetic gene will result from native plant signal or
control elements in those sequences that affect
desaturase gene expression at one or more levels.

According to another aspect of the invention, a
method is provided for constructing and customizing a
bifunctional desaturase/Cyt b5 encoding gene for
expression in the cytosol of a multicellular plant. The
method comprises (a) providing a DNA molecule comprising
a desaturase-encoding moiety operably linked to a Cyt b5-
encoding moiety, said DNA molecule producing the
bifunctional polypeptide in a non-customized form; (b)
back-translating the polypeptide sequence using preferred
codons for expression in a multicellular plant, thereby
producing a back-translated nucleotide sequence; (c)
analyzing the back-translated nucleotide sequence for
features that could diminish or prevent expression in the
plant cytoplasm, including, optionally (1) probable
intron splice sites (characterized by T-rich regions);
(2) plant polyadenylation signals (e.g., AATAAA); (3)
polymerase II termination sequence (e.g., CAN(7-9)AGTNNAA,
where N is any nucleotide); (4) hairpin consensus
sequences (e.g., UCUUCGG); and (5) the sequence-
destabilizing motif ATTTA; (d) modifying the analyzed
sequence to correct or remove the features that could
diminish or prevent expression in the plant cytoplasm;
and, optionally, (e) introducing desirable cloning
features, such as restriction sites, into the sequence in
a manner that does not materially affect the desired
codon usage or final polypeptide sequence.
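
As an illustration of step (b) only, back-translation can be sketched in a few lines of Python. This is a minimal sketch, not the procedure used by the inventors: the function name back_translate and the codon choices in PREFERRED_CODON are assumptions made for the example, and the table values are placeholders rather than entries from a real plant codon usage table.

```python
# Hypothetical preferred-codon table: one codon per residue.
# The codon choices are illustrative placeholders, not real plant data.
PREFERRED_CODON = {
    "M": "ATG", "A": "GCT", "K": "AAG", "L": "CTT", "S": "TCT",
    "*": "TGA",  # assumed most preferred termination codon
}


def back_translate(protein, table=PREFERRED_CODON):
    """Return a DNA sequence encoding `protein`, one table codon per residue."""
    try:
        return "".join(table[residue] for residue in protein)
    except KeyError as err:
        raise ValueError(f"no codon listed for residue {err.args[0]!r}") from None


dna = back_translate("MAKLS*")
print(dna)
```

In practice the table would hold all twenty amino acids plus the termination codon, taken from a usage table for the target plant, and the resulting sequence would then go through steps (c) to (e).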

The method set forth above may be adapted by
incorporating into the customized gene one or more
genomic segments from plant desaturase or other ER lipid
biosynthetic genes, which are determined to further
optimize gene expression in plants. This method
comprises (1) identifying cDNA sequences that have
potential to comprise such beneficial elements, (2)
creating yeast vectors expressing desaturase genes
modified to contain these elements, (3) testing the
vectors in a yeast expression system, (4) isolating
regions from genomic DNA that are homologous to the
beneficial cDNA elements, and (5) using them to construct
chimeric or hybrid synthetic genes that produce
functional and highly efficient desaturase activities in
plant tissues.

Other features and advantages of the present
invention will be better understood by reference to the
drawings, detailed description and examples that follow.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. GCG Pileup comparison of stearoyl-
CoA desaturase protein sequences. Sequences containing a
Cyt b5 domain are indicated with a +; sequences lacking a
Cyt b5 domain are indicated with a -; sequences still in
question are indicated with a ?.

Figure 2. GCG Pileup comparison of Cytochrome
b5 protein sequences.

DETAILED DESCRIPTION OF THE INVENTION 

I. Definitions

Various terms relating to the biological
molecules of the present invention are used hereinabove
and also throughout the specification and claims.

The term "promoter region" refers to the 5'
regulatory regions of a gene.

The term "reporter gene" refers to genetic 
sequences which may be operably linked to a promoter 
region forming a transgene, such that expression of the 
reporter gene coding region is regulated by the promoter
and expression of the transgene is readily assayed. 

The term "selectable marker gene" refers to a 
gene product that when expressed confers a selectable 
phenotype, such as antibiotic resistance, on a
transformed cell or plant.

The term "operably linked" means that the 
regulatory sequences necessary for expression of the 
coding sequence are placed in the DNA molecule in the 
appropriate positions relative to the coding sequence so
as to effect expression of the coding sequence. This
same definition is sometimes applied to the arrangement
of coding sequences and transcription control elements 
(e.g. promoters, enhancers, and termination elements) in 
an expression vector. 

The term "DNA construct" refers to a genetic
sequence used to transform plants and generate progeny
transgenic plants. These constructs may be administered
to plants in a viral or plasmid vector. Other methods of
delivery such as Agrobacterium T-DNA mediated
transformation and transformation using the biolistic
process are also contemplated to be within the scope of
the present invention. The transforming DNA may be
prepared according to standard protocols such as those
set forth in "Current Protocols in Molecular Biology",
eds. Frederick M. Ausubel et al., John Wiley & Sons,
1999.

This invention provides synthetic DNA molecules
(sometimes referred to herein as "synthetic genes") that
encode a fatty acid desaturase useful for modifying the
fatty acid composition of a plant. The DNA molecules
described in accordance with this invention are superior
to DNA molecules currently available for this purpose, in
two important respects: (1) they encode a dual-domain
polypeptide (sometimes referred to herein as a
"bifunctional polypeptide or protein"), one domain being
the fatty acid desaturase, and the other domain being
cytochrome b5, a protein required to support the electron
transfer events that enable the desaturase to function;
and (2) they are customized for expression in the cytosol
of plant cells, and further customized for expression in
particular selected plant species.

Design of synthetic genes of the present
invention is accomplished in two broad steps. First, the
two components (the desaturase-encoding component and the
Cyt b5-encoding component) are selected and linked
together, if they do not occur together naturally.
Second, the DNA molecule is optimized for expression in
the cytosol of a plant cell, or further for expression in
a particular plant species, or group of species.

With regard to the first step, it should be
noted that several fungal, animal and plant species,
including yeast, are now known to contain naturally-
occurring genes encoding dual-domain cytoplasmic fatty
acid desaturases. As mentioned above, the yeast and rat
Δ-9 desaturase genes have been expressed and shown to
function in plants. However, prior to the present
invention, it was not appreciated that the bifunctional
yeast desaturase offers a significant advantage over the
single-function animal desaturase in plant cells, where
the requisite Cyt b5 is available only in small amounts,
and the yeast protein can provide its own supply of Cyt
b5.

With regard to the second step - optimization
for expression in the plant cytosol - it was discovered
in accordance with the present invention that a non-plant
desaturase-encoding gene, such as the yeast OLE1, though
expressed in some plants, may not be optimally expressed
in those plants. Furthermore, the inventors have found
that the yeast gene is poorly expressed in other plant
species, thus highlighting the advantages obtainable by
optimizing such a gene for expression in a plant cell.

Sections II-IV below describe in detail how to
design and use the synthetic genes of the present
invention. To the extent that specific materials are
mentioned, it is merely for purposes of illustration and
is not intended to limit the invention. Unless otherwise
specified, general biochemical and molecular biological
procedures, such as those set forth in Sambrook et al.,
Molecular Cloning, Cold Spring Harbor Laboratory (1989)
(hereinafter "Sambrook et al.") or Ausubel et al. (eds)
Current Protocols in Molecular Biology, John Wiley & Sons
(1999) (hereinafter "Ausubel et al.") are used.



II. Design and construction of the synthetic DNA molecules 

A. Selection of component DNA segments 

This invention contemplates the use of the
following source DNAs, which are thereafter modified for
expression in plants, if necessary:

1. naturally occurring genes or cDNAs that
encode dual domain polypeptides comprising a desaturase
domain and a Cyt b5 domain;

2. chimeric genes in which a desaturase-
encoding sequence from one source (e.g., the desaturase
domain of a dual domain fungal Δ-9 desaturase, or the
single domain rat desaturase), is linked to a Cyt b5-
encoding sequence from a different source (e.g., a
plant);

3. chimeric genes in which a sequence that
encodes a fragment of a naturally occurring plant Cyt b5
(e.g. the heme binding fold, or residues that comprise
the electron donor or acceptor sites, or residues that
act as membrane targeting or retention signals, or
residues that act to stabilize the protein in the plant
cytoplasmic environment) is substituted for homologous
regions in the cytochrome b5 domain of a dual domain
polypeptide such as the yeast Δ-9 desaturase; and

4. chimeric genes in which elements that encode
the essential enzymatic domains from one source (e.g. a
native or synthetic gene derived from a fungal Δ-9
desaturase) are linked to elements derived from native
plant desaturases that enhance transcription, mRNA
processing, mRNA stability, protein folding and
maturation, membrane targeting or retention, or protein
stability.

Naturally occurring genes or cDNAs that encode
dual domain desaturase/Cyt b5 proteins have been
identified in several fungal species, including
Saccharomyces cerevisiae, Pichia augusta, Histoplasma
capsulatum and Cryptococcus curvatus (see Fig. 1).
Naturally occurring genes or cDNAs that encode
independent, diffusible Cyt b5 proteins have been
identified in several plant species, including Nicotiana
tabacum (tobacco), Oryza sativa (rice), Cuscuta reflexa
(southern Asian dodder), Arabidopsis thaliana, Brassica
oleracea and Olea europaea (olive). An N-terminal Cyt b5
domain of a Δ-6 desaturase has also been identified in
the plant Borago officinalis, and in the Saccharomyces
cerevisiae FAH1 gene that encodes a very long chain fatty
acid hydroxylase. Genes or cDNAs from these species, as
well as DNA from any other species identified in the
future as encoding such a dual domain protein, are
contemplated for use in the synthetic genes of the
present invention.

In a preferred embodiment, the yeast OLE1 gene 
is used. This embodiment is described in detail in 
Example 1.

The second strategy involves linking a DNA
segment encoding a fatty acid desaturase from one source
with a Cyt b5 domain from another source. In a preferred
embodiment, this chimeric gene is fashioned after the
naturally-occurring dual function genes discussed above.
That is, the Cyt b5 domain and the desaturase domain are
situated in the same positions respective to each other
as is found in the naturally occurring genes (see, e.g.,
Mitchell & Martin, J. Biol. Chem. 270: 29766-29772,
1995).

The chimeric dual-domain proteins of the
invention are prepared by recombinant DNA methods, in
which DNA sequences encoding each domain are operably
linked together such that upon expression, a fusion
protein having the desaturase and Cyt b5 functions
described above is produced. As defined above, the term
"operably linked" means that the DNA segments encoding
the fusion protein are assembled with respect to each
other, and with respect to an expression vector in which
they are inserted, in such a manner that a functional
fusion protein is effectively expressed. The selection
of appropriate promoters and other 5' and 3' regulatory
regions, as well as the assembly of DNA segments to form
an open reading frame, employs standard methodology well
known to those skilled in the art.

Thus, preparing the chimeric DNAs of the
invention involves selecting DNA sequences encoding each
of the aforementioned components and operably linking the
respective sequences together in an appropriate vector.
The sequences are thereafter expressed to produce the
dual-function protein.

Genes or cDNAs that encode single-function
cytoplasmic Δ-9 fatty acid desaturases have been
identified in a diverse array of procaryotic and
eucaryotic species, including insects, fungi and mammals,
but not plants (Fig. 1). Genes or cDNAs from any of
these species, as well as DNA from any other species
identified in the future as encoding a fatty acid
desaturase, are contemplated for use in the synthetic
genes of the present invention.

In preferred embodiments, desaturase-encoding
genes from eucaryotes, most preferably fungi or mammals,
are used. In a particularly preferred embodiment, a DNA
encoding the rat stearoyl CoA desaturase is used. This
DNA has been successfully expressed in tobacco, and
accordingly is expected to be useful as part of a
chimeric desaturase/Cyt b5 gene of the present invention.

Genes or cDNAs that encode Cyt b5 proteins have
also been identified in a diverse array of eucaryotic
species, including insects, fungi, mammals and plants.
Genes or cDNAs from any of these species, as well as DNA
from any other species identified in the future as
encoding a Cyt b5 protein, are contemplated for use in the
synthetic genes of the present invention.

In preferred embodiments, Cyt b5-encoding genes
or cDNAs from plants are used. These DNAs are preferred
because they naturally comprise the codon usage preferred
in plants, and so require few, if any, of the modification
steps described below for non-plant genes. Particularly
preferred, if available, are Cyt b5-encoding DNAs from the
same plant species (or group of species) to be
transformed with the chimeric gene. For instance,
synthetic chimeric genes constructed for transformation
of Brassica species might comprise a stearoyl CoA
desaturase-encoding domain from rat and a Cyt b5 domain
from Brassica (see Figs. 1 and 2 for specific sources).
This chimeric DNA would require optimization for
expression in Brassica only in the desaturase domain.

With respect to the naturally-occurring dual
domain-encoding genes, as well as the chimeric genes
discussed above, it will be appreciated that the DNA
molecules can be prepared in a variety of ways, including
DNA synthesis, cloning, mutagenesis, amplification,
enzymatic digestion, and similar methods, all available
in the standard literature. Additionally, certain DNA
molecules can be obtained by access to public
repositories, such as the American Type Culture
Collection. Alternatively, DNA molecules that are not
readily available, and/or for which sequence information
is not available, can be isolated from biological sources
using standard hybridization methods and homologous
probes that are available.

B. Optimization for expression in plants 

The second step in designing the synthetic DNA
molecules of the invention is to customize (i.e.
optimize) their sequence for expression in the plant
cytoplasm. This is accomplished by performing one or
more of the steps listed below on the coding sequence of
the above described non-plant (or chimeric)
desaturase/Cyt b5-encoding DNA molecules.

1. From the peptide sequence encoded by the
DNA, back translate using an appropriate plant codon
usage table, making certain in particular that the most
preferred translation termination codon is used.

2. Visually, or with the aid of computer
software, analyze the back-translated nucleotide sequence
for features that could diminish or prevent expression in
the plant cytoplasm. Such features include: (1) probable
intron splice sites (characterized by T-rich regions);
(2) plant polyadenylation signals (e.g., AATAAA); (3)
polymerase II termination sequence (e.g., CAN(7-9)AGTNNAA,
where N is any nucleotide); (4) hairpin consensus
sequences (e.g., UCUUCGG); and (5) the sequence-
destabilizing motif ATTTA (Shah & Kamen, Cell 46: 659-
667, 1986). These features have been described in the
art (U.S. Patent No. 5,500,365 to Fischhoff et al.; U.S.
Patent No. 5,380,831 to Adang et al.).

3. Modify the back-translated sequence in
light of any "problem" sequences identified in step 2.
Note that this step may require the introduction of
codons that are not the most preferred, but instead are
second or third-most preferred, in order to eliminate the
more problematic sequences identified in step 2.

4. Introduce desirable cloning features, such
as restriction sites, into the sequence in a manner that
does not materially affect the desired codon usage or
final polypeptide sequence.
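
The computer-aided analysis of step 2 can be approximated with regular expressions. The following Python sketch is an illustration only, assuming the example motifs listed in step 2 written as DNA (UCUUCGG becomes TCTTCGG); the names PROBLEM_MOTIFS and find_problem_motifs are hypothetical, and the sketch does not attempt splice-site prediction, which needs more than a fixed pattern.

```python
import re

# Example motifs from step 2, expressed as DNA regular expressions.
PROBLEM_MOTIFS = {
    "polyadenylation signal": r"AATAAA",
    "pol II termination": r"CA[ACGT]{7,9}AGT[ACGT]{2}AA",  # CAN(7-9)AGTNNAA
    "hairpin consensus": r"TCTTCGG",  # UCUUCGG written as DNA
    "destabilizing motif": r"ATTTA",
}


def find_problem_motifs(dna):
    """Return sorted (start, name, matched_text) tuples for every motif hit."""
    dna = dna.upper()
    hits = []
    for name, pattern in PROBLEM_MOTIFS.items():
        # A lookahead wrapper reports overlapping occurrences as well.
        for match in re.finditer(f"(?=({pattern}))", dna):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


for start, name, text in find_problem_motifs("ATGGCAATAAAGGATTTACC"):
    print(start, name, text)
```

Each reported position identifies the codons that would be candidates for replacement in step 3, for example by substituting a second-most preferred synonymous codon and then re-running the scan.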

The aforementioned optimization procedure can
be performed so that the final polypeptide sequence is
identical to the initial polypeptide sequence, even
though the underlying nucleotide sequence has been
modified. This is a preferred embodiment of the
invention. However, it is entirely feasible to modify
the initial sequence such that the final sequence is not
identical to the initial sequence, either by virtue of
amino acid substitutions, insertions or deletions. The
more that is known about the structure/function
relationship in a particular desaturase protein, the more
liberties can be taken in modifying the protein sequence
during the DNA optimization process. For instance, the
present inventors have shown that the entire "coiled
coil" domain of the yeast OLE1 gene can be deleted, and
the protein remains functional. Thus, it appears that
OLE1 can tolerate significant modification in the encoded
protein without losing its biological activity.

Codon usage tables for a variety of plants,
including general plant codon usage tables, tables for
dicots, tables for monocots, and tables for particular
species, are widely available. Some of these are
reproduced in Example 1 below. One good location to
access such tables is the website:

http://biochem.otago.ac.nz:800/Transterm/codons.html.

In an exemplary embodiment of the present
invention, the above process is applied to the coding
sequence of the yeast OLE1 gene, which encodes a
cytoplasmically expressed dual-domain protein comprising
a Δ-9 fatty acid desaturase domain and a Cyt b5 domain.
Optimization of the OLE1 gene for expression in
Arabidopsis and related species is described in detail in
Example 1.

In another preferred embodiment, the coding
sequence of the rat stearoyl CoA desaturase is modified
for expression in plants according to the methods
described above. The modified sequence is operably
linked to a coding sequence for a Cyt b5 domain,
preferably from a plant, and most preferably from
Brassica. In this regard, it has been shown that
expression of this rat desaturase in tobacco produces a
functional protein that increases the 16:1 fatty acid
content of plant tissues. Splice site prediction
analysis of the rat desaturase reveals that there are no
plant intron-like sequences within the open reading
frame. However, codon usage analysis reveals that this
desaturase possesses a number of codons that are not
optimal for expression in plants, particularly
Arabidopsis or Brassica.

In another preferred embodiment, the protein
coding sequences of the modified vectors described above
are further modified to increase desaturase activity.
This is done by altering specific amino acids in the
encoded protein that control desaturase activity through
post-translational modifications. These alterations
are presumed to increase the level of desaturase activity
in the host plant by stabilizing the desaturase protein
or by increasing catalytic activity of the desaturase.
Post-translational modifications such as protein
phosphorylation or dephosphorylation have been shown to
alter activity of a number of enzymes by a number of
different mechanisms. These include increasing or
decreasing enzyme activity or protein stability, or
changing the intracellular location of the enzyme. An
examination of a wide range of Δ-9 desaturase enzymes
reveals the existence of a number of highly conserved
potential phosphorylation sites that could serve as
sequences that regulate desaturase activity. These are
shown in bold face on the pile-up diagram in Figure 3 and
are summarized in Table 1 of Example 3. The high degree
of homology between these sites suggests that these
sequences may also be recognized by host plant
phosphorylating or dephosphorylating enzymes. If
phosphorylation of an amino acid within one of the sites
increases the activity of the desaturase, the nucleic
acid sequence corresponding to that amino acid can be
altered to encode a negatively charged amino acid at that
site to permanently increase the activity of the protein
in the host. If phosphorylation of an amino acid within
the site reduces the activity of the desaturase enzyme,
the nucleic acid sequence can be altered to replace that
amino acid with a neutral amino acid that will
permanently increase the activity of the enzyme.
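
The codon replacement described in the preceding paragraph can be illustrated with a short Python sketch. Everything named here is hypothetical: the function, the example coding sequence and the residue number are made up for illustration, and the choice of aspartate or glutamate as the negatively charged residue and alanine as the neutral residue is an assumption rather than a requirement of the method. The replacement codons are the ones listed as most frequent for those residues in the Arabidopsis thaliana codon usage table of Example 1.

```python
# Replacement codons (one-letter amino acid code -> codon), taken from the
# Arabidopsis thaliana table in Example 1: Asp GAT, Glu GAG, Ala GCT.
# Asp/Glu stand in for "negatively charged", Ala for "neutral" (assumption).
REPLACEMENT_CODONS = {"D": "GAT", "E": "GAG", "A": "GCT"}

def substitute_residue(cds, residue_number, new_aa):
    """Return cds with the codon of the 1-based residue_number replaced.

    cds must be an in-frame coding sequence; new_aa is "D", "E" or "A".
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 1 <= residue_number <= len(cds) // 3:
        raise ValueError("residue_number lies outside the coding sequence")
    start = 3 * (residue_number - 1)
    return cds[:start] + REPLACEMENT_CODONS[new_aa] + cds[start + 3:]

if __name__ == "__main__":
    example = "ATGTCTAAG"  # hypothetical Met-Ser-Lys fragment
    # Mimic constitutive phosphorylation of the serine (activity-increasing case).
    print(substitute_residue(example, 2, "D"))
    # Block phosphorylation of the serine (activity-reducing case).
    print(substitute_residue(example, 2, "A"))
```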

In another preferred embodiment, elements of
the genes in the modified vectors described above are
further modified and improved by the linkage or
substitution of sequences derived from native plant ER
lipid biosynthetic genes. Those sequences contain
elements that improve the desaturase activity by
increasing the efficiency of gene expression,
intracellular protein targeting and/or enzyme stability.
This is done by identifying elements of the engineered
desaturase gene that can be replaced or linked with
elements of a plant gene without significantly affecting
the desired activity or specificity of the resulting
enzyme. Genes and cDNAs that encode ER lipid
biosynthetic enzymes from Brassica, Arabidopsis,
Nicotiana tabacum, borage, maize, sunflower and soybeans,
as well as similar plant genes from any other species
that are identified in the future, are contemplated for
use in the synthetic genes of the present invention.

In connection with the aforementioned
embodiment, but not limited thereto, it is particularly
useful in many cases to pre-test constructs of the
invention in a yeast expression system, in order to
eliminate constructs that work poorly before taking the
more labor- and time-intensive step of testing them in
plants. Accordingly, this step may be incorporated into
the methods described herein.

III. Construction of vectors for transforming plant
nuclei, and production of transgenic plants
expressing synthetic genes of the invention

The synthetic genes of the present invention
are intended for use in producing transgenic plants that
optimally express a dual-function desaturase/Cyt b5
protein in the cytoplasm of plant cells. Transformation
of plant nuclei to produce transgenic plants may be
accomplished according to standard methods known in the
art. These include, but are not limited to,
Agrobacterium vectors, PEG treatment of protoplasts,
biolistic DNA delivery, UV laser microbeam, gemini virus
vectors, calcium phosphate treatment of protoplasts,
electroporation of isolated protoplasts, agitation of
cell suspensions with microbeads coated with the
transforming DNA, direct DNA uptake, liposome-mediated
DNA uptake, and the like. Such methods have been
published in the art. See, e.g., Methods for Plant
Molecular Biology, Weissbach & Weissbach, eds., Academic
Press, Inc. (1988); Methods in Plant Molecular Biology,
Schuler & Zielinski, eds., Academic Press, Inc. (1989);
Plant Molecular Biology Manual, Gelvin, Schilperoort &
Verma, eds., Kluwer Academic Publishers, Dordrecht
(1993); and Methods in Plant Molecular Biology - A
Laboratory Manual, Maliga, Klessig, Cashmore, Gruissem &
Varner, eds., Cold Spring Harbor Press (1994).

The method of transformation depends upon the
plant to be transformed. The biolistic DNA delivery
method is useful for nuclear transformation, and is a
preferred method for practice of this invention. In
another embodiment of the invention, Agrobacterium
vectors are used to advantage for efficient
transformation of plant nuclei.

In a preferred embodiment, the synthetic gene
is introduced into plant nuclei in Agrobacterium binary
vectors. Such vectors include, but are not limited to,
BIN19 (Bevan, Nucl. Acids Res., 12: 8711-8721, 1984) and
derivatives thereof, the pBI vector series (Jefferson et
al., EMBO J., 6: 3901-3907, 1987), and binary vectors
pGA482 and pGA492 (An, Plant Physiol., 81: 86-91, 1986).
A new series of Agrobacterium binary vectors, the pPZP
family, is preferred for practice of the present
invention. The use of this vector family for plant
transformation is described by Svab et al. in Methods in
Plant Molecular Biology - A Laboratory Manual, Maliga,
Klessig, Cashmore, Gruissem and Varner, eds., Cold Spring
Harbor Press (1994).

Using an Agrobacterium binary vector system for
transformation, the synthetic gene of the invention is
linked to a nuclear drug resistance marker, such as
kanamycin or gentamycin resistance. Agrobacterium-mediated
transformation of plant nuclei is accomplished
according to the following procedure:

(1) the gene is inserted into the selected
Agrobacterium binary vector;

(2) transformation is accomplished by
co-cultivation of plant tissue (e.g., leaf discs) with a
suspension of recombinant Agrobacterium, followed by
incubation (e.g., two days) on growth medium in the
absence of the drug used for selection (see,
e.g., Horsch et al., Science 227: 1229-1231, 1985);

(3) plant tissue is then transferred onto the
selective medium to identify transformed tissue; and

(4) identified transformants are regenerated
to intact plants.

It should be recognized that the amount of
expression, as well as the tissue specificity of
expression of the synthetic genes in transformed plants
can vary depending on the position of their insertion
into the nuclear genome. Such position effects are well
known in the art; see Weising et al., Ann. Rev. Genet.,
22: 421-477 (1988). For this reason, several nuclear
transformants should be regenerated and tested for
expression of the synthetic gene.

IV. Uses of the synthetic genes and transgenic
plants expressing those genes

The synthetic desaturase genes of the invention
and transgenic plants expressing those genes can be used
for several agriculturally beneficial purposes. For
instance, they can be used in oil-producing crops (e.g.,
corn, soybean, sunflower, rapeseed) to increase the
overall percentages of monounsaturated fatty acids in
those oils, thereby improving their health-promoting
qualities. In this regard, the production of transgenic
rapeseed plants (Brassica napus) is of particular
interest in this invention. Example 1 describes a
synthetic yeast desaturase gene modified for expression
in Arabidopsis. Because the codon usage of Brassica is
very similar to that of Arabidopsis, it is expected that
the synthetic gene described in Example 1 will be as well
expressed in Brassica as it is in Arabidopsis.

Another use for the synthetic genes of the
invention is to modify the flavors of certain fruit or
vegetable crops. It has already been shown that
expression of the unmodified yeast Δ-9 desaturase gene
in tomato results in alterations in fatty acid
composition and fatty acid-derived flavor compounds (Wang
et al., 1996, supra). The synthetic, plant-optimized
version of this gene is expected to function similarly,
and also to be more efficiently expressed in plant cells.

Another use for the synthetic genes of the
invention is to facilitate the formation of omega-5
anacardic acids, a class of secondary compounds derived
from the Δ-9 desaturation of 14:0 in pest-resistant
geraniums (Schultz et al., Proc. Natl. Acad. Sci. USA,
93: 877-885, 1996). It has been shown that formation of
these compounds proceeds from the expression of Δ9
desaturase activity resulting in the formation of Δ9
14:1. Subsequent elongation of these molecules leads to
the formation of omega-5 22:1 and 24:1 in the trichome
exudate that leads to pest resistance against spider
mites and aphids.

Another use for the synthetic genes of the
invention is in the modification of membrane lipid fatty
acyl composition to alter the properties of the
cytoplasmic and plasma membranes of the cell. Such
alterations may affect membrane-associated functions
such as signal transduction, endocytotic or exocytotic
events, entry of fungal or viral pathogens into the cell,
and responses to temperature or other environmentally
caused stress that produces physical changes in the fluid
properties of the plasma membrane or internal cell
membranes. Plants defective in desaturases
have been reported (Somerville and Browse, supra). These
mutant plants contain higher than normal levels of
saturated fatty acids that may lower membrane fluidity
under normal growing conditions. Thus the effects of
temperature on these plants involved high temperature
tolerance as opposed to chilling tolerance. These
studies yielded interesting information that has
relevance to temperature stress in general. A mutant of
Arabidopsis deficient in 16:0 desaturation (Hugly et al.,
Plant Physiol. 90: 1134-1142), for example, has been shown
to appear and grow normally at non-stressful
temperatures. Under high temperature conditions,
however, the mutant performs better than controls in
growth and biosynthetic studies. Higher temperature
stability was also noted in pea thylakoids following
catalytic hydrogenation (Thoman et al., Biochem. Biophys.
Acta 849: 131-140, 1986).

The following examples are provided to describe
the invention in greater detail. They are intended to
illustrate, not to limit, the invention.

EXAMPLE 1

Modification of the Saccharomyces cerevisiae OLE1 Gene
for Expression in Arabidopsis and Related Species

When introduced into tobacco and tomato plants,
the yeast Δ-9 desaturase gene (OLE1) was shown to
desaturate palmitate and stearate, thereby reducing the
levels of saturated fatty acids in triglycerides
(Polashock et al., supra; Wang et al., supra). However,
it was unclear whether optimum expression of the OLE1
gene occurred in those species, and expression in other
plant species has been less than optimum. For example,
the present inventors have found that the level of
expression of the OLE1 gene in tobacco (Polashock et
al., Plant Physiol. 100: 894-901, 1992) and Arabidopsis
varies in different plant tissues and is generally poor
in tobacco and Arabidopsis seeds. Similarly, data from
other investigators indicate that expression of OLE1 in
rapeseed (Brassica napus) seeds is also poor (U.S. Patent
No. 5,777,201 to Poutre et al.).

Differential expression of heterologous genes
in plants can be caused by several factors. It is often
due to the presence of cryptic intron splicing signals.
Thus, it is possible that the multiple banding patterns
observed in northern blots of OLE1-transformed tobacco
(Polashock et al., supra) are due to splicing of the OLE1
mRNA.

In plants, the mRNA splicing mechanism is less
well defined than in mammalian or yeast systems. There
is some conservation of the 5' and 3' splicing signals
but there is no conserved internal splice signal.
However, with the accumulation of plant genomic DNA
sequence data, it is now becoming possible to predict
with some accuracy where intron splicing will occur
(Hebsgaard, S.M., P.G. Korning, N. Tolstrup, J.
Engelbrecht, P. Rouze and S. Brunak, Nucleic Acids
Research 24(17): 3439-3452, 1996). In fact, computer
programs that predict splice sites have now been
developed (the "PlantNetGene" server for splice site
predictions: http://www.cbs.dtu.dk/NetPlantGene.html).
From these sources, it appears that plant introns are
typically identified as T-rich sequences.

Another factor affecting expression of foreign
genes in plants is codon preference. It is now well
known that preferences for certain codons exist among
different phyla, classes, families, genera and species.
Accordingly, by modifying a DNA sequence so that it uses
codons preferred in a particular organism, expression of
that sequence can be optimized.

Other factors affecting the expression of
foreign genes in plants include the presence of putative
polyadenylation signals, hairpin cleavage consensus
motifs, polymerase II termination sequences and the
Shaw-Kamen sequence pattern ATTTA.

This example describes the design and
construction of "pl-ole1", a modified Saccharomyces
cerevisiae OLE1 gene optimized for expression in
Arabidopsis and other plant species.

The nucleotide sequence of the Saccharomyces
cerevisiae OLE1 gene coding sequence has been described
in U.S. Patent No. 5,057,419 to Martin et al.
(incorporated by reference herein) and is set forth below
for convenience as SEQ ID NO:1 (open reading frame starts
at +11). The S. cerevisiae Δ-9 desaturase amino acid
sequence encoded by OLE1 is set forth as SEQ ID NO:2.

I. Design of pl-ole1

To modify OLE1 for optimum expression in
plants, the OLE1 sequence was first analyzed for cryptic
plant splice signals, using the PlantNetGene server for
splice site predictions. This analysis identified a
number of "high confidence" intron splice signals in the
OLE1 sequence. These are shown below (positions
correspond to position numbers in SEQ ID NO:1).

Donor splice sites, direct strand:

Position          Strand  Confidence  exon (5'-3') ^ intron (5'-3')
(Start ATG = +1)
 397              +       1.00        GCTCTCTCTG^GTAAAGTACC
1052              +       0.85        CTATTAAGTG^GTACCAATAC
1074              +       1.00        CCCAACTAAG^GTTATCATCT

Acceptor splice sites, direct strand:

Position          Strand  Confidence  intron (5'-3') ^ exon (5'-3')
 500              +       0.86        GGTCTCACAG^ATCTTACTCC

Next, the OLE1 peptide sequence (SEQ ID NO:2)
was back-translated using an Arabidopsis thaliana codon
usage table, as shown below. Codon usage in Arabidopsis
and several other plant species, including Brassica
napus, Phaseolus vulgaris and Zea mays, is very similar,
as can be seen by a comparison with the respective codon
usage tables of those species, also shown below (the
codon usage table of Saccharomyces cerevisiae is shown
for comparison; codon usage tables taken from
http://biochem.otago.ac.nz:800/Transterm/codons.html).



Arabidopsis thaliana

AmAcid  Codon      Number   /1000  Fraction
Gly     GGG       6027.00   10.31      0.14
Gly     GGA      15393.00   26.32      0.37
Gly     GGT      14890.00   25.46      0.35
Gly     GGC       5654.00    9.67      0.13
Glu     GAG      19825.00   33.90      0.51
Glu     GAA      18672.00   31.93      0.49
Asp     GAT      20862.00   35.67      0.65
Asp     GAC      11061.00   18.91      0.35
Val     GTG      10414.00   17.81      0.26
Val     GTA       5145.00    8.80      0.13
Val     GTT      16157.00   27.63      0.41
Val     GTC       8156.00   13.95      0.20
Ala     GCG       5361.00    9.17      0.13
Ala     GCA      10552.00   18.04      0.25
Ala     GCT      18782.00   32.12      0.45
Ala     GCC       7249.00   12.40      0.17
Arg     AGG       6684.00   11.43      0.22
Arg     AGA      10280.00   17.58      0.34
Ser     AGT       7369.00   12.60      0.16
Ser     AGC       6399.00   10.94      0.14
Lys     AAG      20436.00   34.94      0.55
Lys     AAA      16882.00   28.87      0.45
Asn     AAT      11658.00   19.93      0.47
Asn     AAC      12987.00   22.21      0.53
Met     ATG      14817.00   25.34      1.00
Ile     ATA       6571.00   11.24      0.21
Ile     ATT      13028.00   22.28      0.41
Ile     ATC      11855.00   20.27      0.38
Thr     ACG       4346.00    7.43      0.14
Thr     ACA       8703.00   14.88      0.28
Thr     ACT      10909.00   18.65      0.36
Thr     ACC       6720.00   11.49      0.22
Trp     TGG       6868.00   11.74      1.00
End     TGA        652.00    1.11      0.44
Cys     TGT       5641.00    9.65      0.58
Cys     TGC       4154.00    7.10      0.42
End     TAG        252.00    0.43      0.17
End     TAA        591.00    1.01      0.40
Tyr     TAT       8052.00   13.77      0.47
Tyr     TAC       8965.00   15.33      0.53
Leu     TTG      11727.00   20.05      0.22
Leu     TTA       6361.00   10.88      0.12
Phe     TTT      11703.00   20.01      0.47
Phe     TTC      13066.00   22.34      0.53
Ser     TCG       4830.00    8.26      0.10
Ser     TCA       9033.00   15.45      0.19
Ser     TCT      13022.00   22.27      0.28
Ser     TCC       6214.00   10.63      0.13
Arg     CGG       2531.00    4.33      0.08
Arg     CGA       3142.00    5.37      0.10
Arg     CGT       5680.00    9.71      0.19
Arg     CGC       2100.00    3.59      0.07
Gln     CAG       9564.00   16.35      0.47
Gln     CAA      10908.00   18.65      0.53
His     CAT       7466.00   12.77      0.58
His     CAC       5415.00    9.26      0.42
Leu     CTG       5669.00    9.69      0.11
Leu     CTA       5350.00    9.15      0.10
Leu     CTT      14395.00   24.61      0.27
Leu     CTC       9751.00   16.67      0.18
Pro     CCG       4676.00    8.00      0.17
Pro     CCA       9131.00   15.61      0.33
Pro     CCT      10732.00   18.35      0.39
Pro     CCC       3331.00    5.70      0.12



Brassica napus

AmAcid  Codon      Number   /1000  Fraction
Gly     GGG        730.00   11.21      0.13
Gly     GGA       2042.00   31.37      0.36
Gly     GGT       1952.00   29.99      0.35
Gly     GGC        892.00   13.70      0.16
Glu     GAG       2119.00   32.55      0.55
Glu     GAA       1764.00   27.10      0.45
Asp     GAT       1895.00   29.11      0.56
Asp     GAC       1478.00   22.70      0.44
Val     GTG       1231.00   18.91      0.28
Val     GTA        493.00    7.57      0.11
Val     GTT       1624.00   24.95      0.36
Val     GTC       1124.00   17.27      0.25
Ala     GCG        615.00    9.45      0.13
Ala     GCA       1167.00   17.93      0.24
Ala     GCT       2028.00   31.15      0.42
Ala     GCC       1056.00   16.22      0.22
Arg     AGG        697.00   10.71      0.22
Arg     AGA        996.00   15.30      0.32
Ser     AGT        736.00   11.31      0.15
Ser     AGC        803.00   12.34      0.17
Lys     AAG       2243.00   34.46      0.55
Lys     AAA       1817.00   27.91      0.45
Asn     AAT       1058.00   16.25      0.37
Asn     AAC       1811.00   27.82      0.63
Met     ATG       1538.00   23.63      1.00
Ile     ATA        669.00   10.28      0.20
Ile     ATT       1271.00   19.52      0.37
Ile     ATC       1461.00   22.44      0.43
Thr     ACG        563.00    8.65      0.15
Thr     ACA       1059.00   16.27      0.28
Thr     ACT       1154.00   17.73      0.30
Thr     ACC       1073.00   16.48      0.28
Trp     TGG        798.00   12.26      1.00
End     TGA         69.00    1.06      0.37
Cys     TGT        517.00    7.94      0.50
Cys     TGC        509.00    7.82      0.50
End     TAG         33.00    0.51      0.18
End     TAA         83.00    1.28      0.45
Tyr     TAT        792.00   12.17      0.38
Tyr     TAC       1283.00   19.71      0.62
Leu     TTG       1051.00   16.14      0.20
Leu     TTA        508.00    7.80      0.09
Phe     TTT       1003.00   15.41      0.39
Phe     TTC       1562.00   23.99      0.61
Ser     TCG        475.00    7.30      0.10
Ser     TCA        856.00   13.15      0.18
Ser     TCT       1147.00   17.62      0.24
Ser     TCC        799.00   12.27      0.17
Arg     CGG        219.00    3.36      0.07
Arg     CGA        297.00    4.56      0.09
Arg     CGT        659.00   10.12      0.21
Arg     CGC        275.00    4.22      0.09
Gln     CAG       1188.00   18.25      0.50
Gln     CAA       1168.00   17.94      0.50
His     CAT        651.00   10.00      0.49
His     CAC        672.00   10.32      0.51
Leu     CTG        592.00    9.09      0.11
Leu     CTA        579.00    8.89      0.11
Leu     CTT       1416.00   21.75      0.26
Leu     CTC       1208.00   18.56      0.23
Pro     CCG        542.00    8.33      0.15
Pro     CCA       1180.00   18.13      0.33
Pro     CCT       1281.00   19.68      0.36
Pro     CCC        527.00    8.10      0.15

Phaseolus vulgaris



AmAcid  Codon      Number   /1000  Fraction
Gly     GGG        371.00   13.30      0.15
Gly     GGA        771.00   27.64      0.32
Gly     GGT        817.00   29.29      0.34
Gly     GGC        441.00   15.81      0.18
Glu     GAG        912.00   32.69      0.54
Glu     GAA        767.00   27.50      0.46
Asp     GAT        776.00   27.82      0.55
Asp     GAC        625.00   22.41      0.45
Val     GTG        661.00   23.70      0.36
Val     GTA        174.00    6.24      0.09
Val     GTT        653.00   23.41      0.36
Val     GTC        346.00   12.40      0.19
Ala     GCG        180.00    6.45      0.09
Ala     GCA        528.00   18.93      0.26
Ala     GCT        791.00   28.36      0.39
Ala     GCC        553.00   19.82      0.27
Arg     AGG        324.00   11.61      0.29
Arg     AGA        325.00   11.65      0.29
Ser     AGT        317.00   11.36      0.14
Ser     AGC        353.00   12.65      0.15
Lys     AAG       1054.00   37.78      0.60
Lys     AAA        697.00   24.99      0.40
Asn     AAT        555.00   19.90      0.42
Asn     AAC        782.00   28.03      0.58
Met     ATG        567.00   20.33      1.00
Ile     ATA        274.00    9.82      0.20
Ile     ATT        539.00   19.32      0.40
Ile     ATC        548.00   19.65      0.40
Thr     ACG        166.00    5.95      0.11
Thr     ACA        362.00   12.98      0.24
Thr     ACT        480.00   17.21      0.32
Thr     ACC        490.00   17.57      0.33
Trp     TGG        342.00   12.26      1.00
End     TGA         34.00    1.22      0.44
Cys     TGT        145.00    5.20      0.39
Cys     TGC        229.00    8.21      0.61
End     TAG         22.00    0.79      0.28
End     TAA         22.00    0.79      0.28
Tyr     TAT        400.00   14.34      0.40
Tyr     TAC        597.00   21.40      0.60
Leu     TTG        543.00   19.47      0.24
Leu     TTA        184.00    6.60      0.08
Phe     TTT        458.00   16.42      0.43
Phe     TTC        601.00   21.55      0.57
Ser     TCG        149.00    5.34      0.06
Ser     TCA        416.00   14.91      0.18
Ser     TCT        606.00   21.72      0.26
Ser     TCC        501.00   17.96      0.21
Arg     CGG         71.00    2.55      0.06
Arg     CGA         76.00    2.72      0.07
Arg     CGT        169.00    6.06      0.15
Arg     CGC        158.00    5.66      0.14
Gln     CAG        437.00   15.67      0.48
Gln     CAA        470.00   16.85      0.52
His     CAT        298.00   10.68      0.46
His     CAC        355.00   12.73      0.54
Leu     CTG        351.00   12.58      0.15
Leu     CTA        184.00    6.60      0.08
Leu     CTT        569.00   20.40      0.25
Leu     CTC        452.00   16.20      0.20
Pro     CCG        147.00    5.27      0.08
Pro     CCA        694.00   24.88      0.37
Pro     CCT        664.00   23.80      0.36
Pro     CCC        352.00   12.62      0.19


Zea mays

AmAcid  Codon      Number   /1000  Fraction
Gly     GGG       2466.00   15.07      0.19
Gly     GGA       2186.00   13.36      0.17
Gly     GGT       2607.00   15.93      0.20
Gly     GGC       5499.00   33.61      0.43
Glu     GAG       7364.00   45.01      0.72
Glu     GAA       2823.00   17.25      0.28
Asp     GAT       3425.00   20.93      0.37
Asp     GAC       5740.00   35.08      0.63
Val     GTG       4365.00   26.68      0.38
Val     GTA        916.00    5.60      0.08
Val     GTT       2516.00   15.38      0.22
Val     GTC       3644.00   22.27      0.32
Ala     GCG       3698.00   22.60      0.24
Ala     GCA       2517.00   15.38      0.16
Ala     GCT       3602.00   22.01      0.24
Ala     GCC       5481.00   33.50      0.36
Arg     AGG       2500.00   15.28      0.27
Arg     AGA       1199.00    7.33      0.13
Ser     AGT       1170.00    7.15      0.10
Ser     AGC       2776.00   16.97      0.24
Lys     AAG       7241.00   44.25      0.79
Lys     AAA       1969.00   12.03      0.21
Asn     AAT       1946.00   11.89      0.33
Asn     AAC       3939.00   24.07      0.67
Met     ATG       4071.00   24.88      1.00
Ile     ATA       1014.00    6.20      0.13
Ile     ATT       2099.00   12.83      0.28
Ile     ATC       4403.00   26.91      0.59
Thr     ACG       1890.00   11.55      0.22
Thr     ACA       1620.00    9.90      0.19
Thr     ACT       1757.00   10.74      0.21
Thr     ACC       3236.00   19.78      0.38
Trp     TGG       1994.00   12.19      1.00
End     TGA        199.00    1.22      0.45
Cys     TGT        770.00    4.71      0.28
Cys     TGC       1963.00   12.00      0.72
End     TAG        121.00    0.74      0.28
End     TAA        120.00    0.73      0.27
Tyr     TAT       1303.00    7.96      0.27
Tyr     TAC       3440.00   21.02      0.73
Leu     TTG       1807.00   11.04      0.13
Leu     TTA        582.00    3.56      0.04
Phe     TTT       1697.00   10.37      0.29
Phe     TTC       4082.00   24.95      0.71
Ser     TCG       1620.00    9.90      0.14
Ser     TCA       1592.00    9.73      0.14
Ser     TCT       1792.00   10.95      0.15
Ser     TCC       2746.00   16.78      0.23
Arg     CGG       1505.00    9.20      0.16
Arg     CGA        610.00    3.73      0.06
Arg     CGT       1018.00    6.22      0.11
Arg     CGC       2562.00   15.66      0.27
Gln     CAG       4280.00   26.16      0.72
Gln     CAA       1626.00    9.94      0.28
His     CAT       1378.00    8.42      0.36
His     CAC       2431.00   14.86      0.64
Leu     CTG       4069.00   24.87      0.29
Leu     CTA        904.00    5.52      0.07
Leu     CTT       2415.00   14.76      0.17
Leu     CTC       4079.00   24.93      0.29
Pro     CCG       2642.00   16.15      0.29
Pro     CCA       2152.00   13.15      0.23
Pro     CCT       2102.00   12.85      0.23
Pro     CCC       2344.00   14.33      0.25

Saccharomyces cerevisiae

AmAcid  Codon      Number   /1000  Fraction
Gly     GGG      18129.00    6.18      0.12
Gly     GGA      32850.00   11.20      0.22
Gly     GGT      66575.00   22.69      0.45
Gly     GGC      28821.00    9.82      0.20
Glu     GAG      57100.00   19.46      0.30
Glu     GAA     133513.00   45.51      0.70
Asp     GAT     111120.00   37.88      0.65
Asp     GAC      58642.00   19.99      0.35
Val     GTG      32144.00   10.96      0.20
Val     GTA      35470.00   12.09      0.22
Val     GTT      63678.00   21.71      0.39
Val     GTC      33136.00   11.30      0.20
Ala     GCG      18402.00    6.27      0.11
Ala     GCA      47728.00   16.27      0.30
Ala     GCT      58916.00   20.08      0.37
Ala     GCC      35917.00   12.24      0.22
Arg     AGG      27990.00    9.54      0.21
Arg     AGA      61524.00   20.97      0.47
Ser     AGT      42499.00   14.49      0.16
Ser     AGC      29298.00    9.99      0.11
Lys     AAG      89539.00   30.52      0.42
Lys     AAA     124327.00   42.38      0.58
Asn     AAT     106379.00   36.26      0.60
Asn     AAC      71659.00   24.43      0.40
Met     ATG      61216.00   20.87      1.00
Ile     ATA      53773.00   18.33      0.28
Ile     ATT      88869.00   30.29      0.46
Ile     ATC      49422.00   16.85      0.26
Thr     ACG      24131.00    8.23      0.14
Thr     ACA      52363.00   17.85      0.31
Thr     ACT      58260.00   19.86      0.34
Thr     ACC      35998.00   12.27      0.21
Trp     TGG      30707.00   10.47      1.00
End     TGA       1901.00    0.65      0.30
Cys     TGT      23942.00    8.16      0.62
Cys     TGC      14448.00    4.93      0.38
End     TAG       1421.00    0.48      0.23
End     TAA       2985.00    1.02      0.47
Tyr     TAT      55441.00   18.90      0.57
Tyr     TAC      42016.00   14.32      0.43
Leu     TTG      79248.00   27.01      0.28
Leu     TTA      77691.00   26.48      0.28
Phe     TTT      78451.00   26.74      0.59
Phe     TTC      53809.00   18.34      0.41
Ser     TCG      25856.00    8.81      0.10
Ser     TCA      55962.00   19.08      0.21
Ser     TCT      69019.00   23.53      0.26
Ser     TCC      41460.00   14.13      0.16
Arg     CGG       5414.00    1.85      0.04
Arg     CGA       9166.00    3.12      0.07
Arg     CGT      18429.00    6.28      0.14
Arg     CGC       7924.00    2.70      0.06
Gln     CAG      36018.00   12.28      0.31
Gln     CAA      78385.00   26.72      0.69
His     CAT      40211.00   13.71      0.64
His     CAC      22609.00    7.71      0.36
Leu     CTG      31503.00   10.74      0.11
Leu     CTA      39789.00   13.56      0.14
Leu     CTT      36697.00   12.51      0.13
Leu     CTC      16401.00    5.59      0.06
Pro     CCG      15796.00    5.38      0.12
Pro     CCA      51725.00   17.63      0.41
Pro     CCT      39402.00   13.43      0.31
Pro     CCC      20387.00    6.95      0.16
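
The "Fraction" column in the codon usage tables appears to be each codon's count divided by the total count for all codons of the same amino acid; this reading is an assumption, checked here only against the glycine and glutamine rows of the Arabidopsis thaliana table. The following Python sketch recomputes those fractions and ranks the codons of one amino acid by usage. The function names and the row layout are illustrative.

```python
from collections import defaultdict

# Rows copied from the Arabidopsis thaliana table: (amino acid, codon, count).
ROWS = [
    ("Gly", "GGG", 6027), ("Gly", "GGA", 15393),
    ("Gly", "GGT", 14890), ("Gly", "GGC", 5654),
    ("Gln", "CAG", 9564), ("Gln", "CAA", 10908),
]

def codon_fractions(rows):
    """Map each codon to its share of all codons for the same amino acid."""
    totals = defaultdict(int)
    for amino_acid, _, count in rows:
        totals[amino_acid] += count
    return {codon: round(count / totals[amino_acid], 2)
            for amino_acid, codon, count in rows}

def ranked_codons(rows, amino_acid):
    """Codons for amino_acid, most frequently used first."""
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    return [codon for aa, codon, _ in ordered if aa == amino_acid]

if __name__ == "__main__":
    print(codon_fractions(ROWS))
    print(ranked_codons(ROWS, "Gly"))
```

A ranking of this kind gives the most preferred codon for back-translation as well as the second and third choices that may be needed when the most preferred codon creates a problematic sequence.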



For each amino acid, the new pl-ole1 gene was
designed to use the codon most preferred in Arabidopsis,
with the following exceptions:



1. The codon for glutamine CAG was
switched to CAA. Though the codon preference for
glutamine is the same for both CAG and CAA in
Arabidopsis, CAA was used since the AG motif is part of
the 3' intron splice signal.



2. In OLE1, there are regions of high
leucine/valine amino acid usage (e.g., between positions
322 to 571 of the nucleotide sequence are codons coding
for 11 leucines and 7 valines). These regions correspond
to the OLE1 protein transmembrane domains. If the most
preferred codons in Arabidopsis (CTT and GTT,
respectively) were used, the region would take on the
characteristics of a plant intron, i.e., high T content,
thereby introducing a number of highly probable 5' splice
sites, which could not be removed without altering the
amino acid sequence. Accordingly, a mixture of
alternative codons was used for these amino acids.
Similar changes were also applied to two other regions of
OLE1 (positions 781 to 900 and positions 1081 to 1140).
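
The effect of the second exception on T content can be illustrated with a small Python sketch. The peptide used below is a made-up leucine/valine-rich stretch, not a fragment of OLE1, and the function names are hypothetical. The two codon sets are the ones named in this example: CTT and GTT as the most preferred Arabidopsis codons, and CTC and GTG as the alternatives substituted in pl-ole1.

```python
MOST_PREFERRED = {"L": "CTT", "V": "GTT"}  # most preferred codons in Arabidopsis
ALTERNATIVE = {"L": "CTC", "V": "GTG"}     # alternative codons used in pl-ole1

def back_translate(peptide, codon_for):
    """Back-translate a one-letter peptide with a residue-to-codon mapping."""
    return "".join(codon_for[residue] for residue in peptide)

def t_fraction(seq):
    """Fraction of T residues in a nucleotide sequence."""
    return seq.upper().count("T") / len(seq)

if __name__ == "__main__":
    peptide = "LVLLVL"  # hypothetical transmembrane-like stretch
    for label, table in (("preferred", MOST_PREFERRED), ("alternative", ALTERNATIVE)):
        dna = back_translate(peptide, table)
        print(label, dna, f"T fraction = {t_fraction(dna):.2f}")
```

In this toy case the T content falls from two thirds to one third while the encoded peptide is unchanged, which is the rationale given above for mixing alternative codons in the transmembrane regions.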



Next, a search for problematic sequences, such
as putative polyadenylation signals, hairpin cleavage
consensus motifs, ATTTA motifs or concatamers thereof,
was conducted. Such sequences are described in detail in
U.S. Patent No. 5,380,831 to Adang et al. (incorporated
by reference herein). This search identified one hairpin
cleavage consensus motif, CTTCGG, at position 553-559 of
SEQ ID NO:1, which was removed by changing TTC to TTT
(both encoding phenylalanine).
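
A synonymous change of this kind can be checked mechanically. The Python sketch below swaps one codon and refuses the change if the translation differs. The nine-nucleotide fragment is hypothetical (it is not taken from SEQ ID NO:1), the function names are illustrative, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds):
    """Translate an in-frame DNA sequence; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def synonymous_swap(cds, codon_index, new_codon):
    """Replace the codon at 0-based codon_index, rejecting non-synonymous changes."""
    start = 3 * codon_index
    edited = cds[:start] + new_codon + cds[start + 3:]
    if translate(edited) != translate(cds):
        raise ValueError("substitution changes the encoded amino acid")
    return edited

if __name__ == "__main__":
    fragment = "ATCTTCGGA"  # hypothetical Ile-Phe-Gly fragment containing CTTCGG
    edited = synonymous_swap(fragment, 1, "TTT")
    print(edited, "CTTCGG" in edited, translate(edited))
```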

Next, a BamHI site and translation initiation
consensus were added to the 5' end of the OLE1 coding
sequence (M. Kozak, J. Biol. Chem. 266(30): 19867-19870,
1991). An XbaI and a BamHI site were added to the 3' end
of the coding sequence. A PacI site was introduced into
the same position as the original S. cerevisiae OLE1 PacI
site (within the cytochrome b5 domain), in order to
provide a convenient restriction site for construction of
this and other synthetic OLE1 genes. Other convenient
restriction sites, which enable modular construction of
synthetic OLE1 genes, are inherent within the final
sequence of the new pl-ole1 gene.

Finally, the termination codon was checked
against a stop codon consensus database, "TransTerm"
(Dalphin et al., Nucl. Acids Res. 25(1): 246-247, 1997).
The existing termination sequence, TGAT, appeared
suitable for use in Arabidopsis, and so was not altered.

II. Construction of pl-ole1

The rebuilt pl-ole1 nucleotide sequence was
constructed commercially (Operon Technologies, Inc.).
The plasmid containing the rebuilt gene was designated
pAMCM013. The pl-ole1 nucleotide sequence is set forth
below as SEQ ID NO:3 (open reading frame starts at +11).
This sequence encodes SEQ ID NO:2, but differs from the
S. cerevisiae OLE1 gene (SEQ ID NO:1) in the following
respects (summarized from above):

1. Arabidopsis thaliana codon usage; CAG
switched to CAA for glutamine;

2. Translation initiation consensus added;

3. Hairpin removed;

4. Several (but not all) PlantNetGene
predicted splice sites removed;

5. Eleven leucines changed from CTT to CTC,
and 7 valines changed from GTT to GTG in positions
322-571, which corresponds to a plant intron-like region;
similar changes made in regions 781-900 and 1081-1140;
valine at position 432 retained as GTT to maintain
Psp1406I site;

6. Certain leucine and valine codons were
altered so that the same codons would not appear adjacent
to others;

7. Intron acceptor site at position 1047
altered;

8. Restriction sites added to allow modular
construction; Psp1406I site removed at position 1441; and

9. PacI site introduced at position 1362; an
introduced NgoMI site at position 867 removed.
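
Every codon change named in the list above is meant to be silent, that is, to leave SEQ ID NO:2 unchanged. The following sketch checks that for the codon pairs mentioned in the text, using the standard genetic code. The names CODON_TABLE and is_synonymous are illustrative only and do not come from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(old_codon, new_codon):
    """True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[old_codon.upper()] == CODON_TABLE[new_codon.upper()]


# Codon substitutions named in the text: Gln, Phe, Leu and Val.
CHANGES = [("CAG", "CAA"), ("TTC", "TTT"), ("CTT", "CTC"), ("GTT", "GTG")]

for old, new in CHANGES:
    print(old, "->", new, CODON_TABLE[old], is_synonymous(old, new))
```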

A gap alignment of SEQ ID NO:1 (top) and SEQ
ID NO:3 (bottom) is shown below:

Gap alignment of wild type and rebuilt OLE1 sequences.
Percent Similarity: 79.871 Percent Identity: 79.871

   1 TACAACAAAGATGCCAACTTCTGGAACTACTATTGAATTGATTGACGACC 50
   1 ggatccaacaATGCCTACTTCTGGAACTACTATCGAGCTTATCGATGATC 50

  51 AATTTCCAAAGGATGACTCTGCCAGCAGTGGCATTGTCGACGAAGTCGAC 100
  51 AATTCCCTAAGGATGATTCTGCTTCTTCTGGAATCGTTGATGAGGTTGAT 100

 101 TTAACGGAAGCTAATATTTTGGCTACTGGTTTGAATAAGAAAGCACCAAG 150
 101 CTTACTGAGGCTAACATCCTTGCTACTGGACTTAACAAGAAGGCTCCTAG 150

 151 AATTGTCAACGGTTTTGGTTCTTTAATGGGCTCCAAGGAAATGGTTTCCG 200
 151 AATCGTTAACGGATTCGGATCTCTTATGGGATCTAAGGAGATGGTTTCTG 200

 201 TGGAATTCGACAAGAAGGGAAACGAAAAGAAGTCCAATTTGGATCGTCTG 250
 201 TTGAGTTCGATAAGAAGGGAAACGAGAAGAAGTCTAACCTTGATAGACTT 250

 251 CTAGAAAAGGACAACCAAGAAAAAGAAGAAGCTAAAACTAAAATTCACAT 300
 251 CTTGAGAAGGATAACCAAGAGAAGGAGGAGGCTAAGACTAAGATCCATAT 300

 301 CTCCGAACAACCATGGACTTTGAATAACTGGCACCAACATTTGAACTGGT 350
 301 CTCTGAGCAACCTTGGACTCTcAACAACTGGCATCAACATCTcAACTGGC 350

 351 TGAACATGGTTCTTGTTTGTGGTATGCCAATGATTGGTTGGTACTTCGCT 400
 351 TcAACATGGTgCTcGTcTGTGGAATGCCTATGATCGGATGGTACTTCGCT 400

 401 CTCTCTGGTAAAGTACCTTTGCATTTAAACGTTTTCCTTTTCTCCGTTTT 450
 401 CTcTCTGGAAAaGTgCCTCTcCATCTcAACGTTTTCcTcTTCTCTGTcTT 450

 451 CTACTACGCTGTCGGTGGTGTTTCTATTACTGCCGGTTACCATAGATTAT 500
 451 CTACTACGCTGTTGGAGGAGTgTCTATCACTGCTGGATACCATAGACTcT 500

 501 GGTCTCACAGATCTTACTCCGCTCACTGGCCATTGAGATTATTCTACGCT 550
 501 GGTCTCATAGATCTTACTCTGCTCATTGGCCTCTTAGACTcTTCTACGCT 550

 551 ATCTTCGGTTGTGCTTCCGTTGAAGGGTCCGCTAAATGGTGGGGCCACTC 600
 551 ATCTTtGGATGTGCTTCTGTTGAGGGATCTGCTAAGTGGTGGGGACATTC 600

 601 TCACAGAATTCACCATCGTTACACTGATACCTTGAGAGATCCTTATGACG 650
 601 TCATAGAATCCATCATAGATACACTGATACTCTTAGAGATCCTTACGATG 650

 651 CTCGTAGAGGTCTATGGTACTCCCACATGGGATGGATGCTTTTGAAGCCA 700
 651 CTAGAAGAGGACTTTGGTACTCTCATATGGGATGGATGCTTCTTAAGCCT 700

 701 AATCCAAAATACAAGGCTAGAGCTGATATTACCGATATGACTGATGATTG 750
 701 AACCCTAAGTACAAGGCTAGAGCTGATATCACTGATATGACTGATGATTG 750

 751 GACCATTAGATTCCAACACAGACACTACATCTTGTTGATGTTATTAACCG 800
 751 GACTATCAGATTCCAACATAGACATTACATCtTgCTcATGCTcCTTACTG 800

 801 CTTTCGTCATTCCAACTCTTATCTGTGGTTACTTTTTCAACGACTATATG 850
 801 CTTTCGTgATCCCTACTCTcATCTGTGGATACTTCTTCAACGATTACATG 850

 851 GGTGGTTTGATCTATGCCGGTTTTATTCGTGTCTTTGTCATTCAACAAGC 900
 851 GGAGGACTcATCTACGCTGGATTCATCAGAGTgTTCGTcATCCAACAAGC 900

 901 TACCTTTTGCATTAACTCCATGGCTCATTACATCGGTACCCAACCATTCG 950
 901 TACTTTCTGTATCAACTCTATGGCTCATTACATCGGAACTCAACCTTTCG 950

 951 ATGACAGAAGAACCCCTCGTGACAACTGGATTACTGCCATTGTTACTTTC 1000
 951 ATGATAGAAGAACTCCTAGAGATAACTGGATCACTGCTATCGTTACTTTC 1000

1001 GGTGAAGGTTACCATAACTTCCACCACGAATTCCCAACTGATTACAGAAA 1050
1001 GGAGAGGGATACCATAACTTCCATCATGAGTTCCCTACTGATTAtAGaAA 1050

1051 CGCTATTAAGTGGTACCAATACGACCCAACTAAGGTTATCATCTATTTGA 1100
1051 CGCTATCAAGTGGTACCAATACGATCCTACTAAaGTgATCATCTACtTgA 1100

1101 CTTCTTTAGTTGGTCTAGCATACGACTTGAAGAAATTCTCTCAAAATGCT 1150
1101 CTTCTCTcGTgGGACTTGCTTACGATCTcAAGAAGTTCTCTCAAAACGCT 1150

1151 ATTGAAGAAGCCTTGATTCAACAAGAACAAAAGAAGATCAATAAAAAGAA 1200
1151 ATCGAGGAGGCTCTTATCCAACAAGAGCAAAAGAAGATCAACAAGAAGAA 1200

1201 GGCTAAGATTAACTGGGGTCCAGTTTTGACTGATTTGCCAATGTGGGACA 1250
1201 GGCTAAGATtAAtTGGGGACCTGTTCTTACTGATCTTCCTATGTGGGATA 1250

1251 AACAAACCTTCTTGGCTAAGTCTAAGGAAAACAAGGGTTTGGTTATCATT 1300
1251 AGCAAACTTTCCTTGCTAAGTCTAAGGAGAACAAGGGACTTGTTATCATC 1300

1301 TCTGGTATTGTTCACGACGTATCTGGTTATATCTCTGAACATCCAGGTGG 1350
1301 TCTGGAATCGTTCATGATGTTTCTGGATACATCTCTGAGCATCCTGGAGG 1350

1351 TGAAACTTTAATTAAAACTGCATTAGGTAAGGACGCTACCAAGGCTTTCA 1400
1351 AGAGACTttaATtAAGACTGCTCTTGGAAAGGATGCTACTAAGGCTTTCT 1400

1401 GTGGTGGTGTCTACCGTCACTCAAATGCCGCTCAAAATGTCTTGGCTGAT 1450
1401 CTGGAGGAGTTTACAGACATTCTAACGCTGCTCAAAACGTGCTTGCTGAT 1450

1451 ATGAGAGTGGCTGTTATCAAGGAAAGTAAGAACTCTGCTATTAGAATGGC 1500
1451 ATGAGAGTTGCTGTTATCAAGGAGTCTAAGAACTCTGCTATCAGAATGGC 1500

1501 TAGTAAGAGAGGTGAAATCTACGAAACTGGTAAGTTCTTTTAAGCATCAC 1550
1501 TTCTAAGAGAGGAGAGATCTACGAGACTGGAAAGTTCTTCTGAtctagag 1550

1551 ATTAC 1555
1551 gatcc 1555
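
Percent identity over an ungapped pair of aligned rows is simply the share of columns in which both sequences carry the same base. The sketch below is an illustration and not the GCG Gap program that produced the alignment; the function name percent_identity is hypothetical, and the comparison ignores letter case because the rebuilt sequence marks some positions in lowercase. It is applied here to the first 50 columns only.

```python
def percent_identity(top, bottom):
    """Percent of compared columns with identical bases (case-insensitive).

    Columns containing a gap character ('-' or '.') are skipped.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(top.upper(), bottom.upper()):
        if a in "-." or b in "-.":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Columns 1-50 of the alignment: SEQ ID NO:1 (top), SEQ ID NO:3 (bottom).
top_row = "TACAACAAAGATGCCAACTTCTGGAACTACTATTGAATTGATTGACGACC"
bottom_row = "ggatccaacaATGCCTACTTCTGGAACTACTATCGAGCTTATCGATGATC"

identity = percent_identity(top_row, bottom_row)
print(round(identity, 1))
```

Concatenating all rows of each sequence before calling the function gives the figure for the whole alignment; the first block scores lower than the overall value because it includes the added 5' BamHI site and initiation consensus.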



The pl-ole1 synthetic gene contains no
intron-like regions or predicted splice sites within its
sequence. Moreover, comparing the codon usage of
Arabidopsis with that of Brassica napus, Phaseolus
vulgaris or Zea mays, with the exception of cysteine (a
rare amino acid that comprises 1.7% of all Arabidopsis
codons, and occurs 4 times (0.8%) in OLE1), the sequence
contains no rare codons for any of those species. The
codon usage of pl-ole1 is particularly similar to the
preferred usage of Brassica napus. Accordingly, pl-ole1
is expected to be particularly well expressed in all
those species, and well expressed in any plant species.
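
Codon usage statements of this kind rest on a simple tally of in-frame triplets. The sketch below shows such a tally; it is an assumption-laden illustration, not the analysis actually performed. The helper codon_counts is hypothetical, the 30-nucleotide sample is the start of the pl-ole1 open reading frame (positions 11-40 of SEQ ID NO:3 in the alignment above), and the figure of 510 codons is inferred from an open reading frame spanning positions 11-1540.

```python
from collections import Counter


def codon_counts(orf):
    """Count in-frame codons of an open reading frame."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return Counter(orf[i:i + 3] for i in range(0, len(orf), 3))


# First ten codons of the rebuilt ORF (SEQ ID NO:3, positions 11-40).
sample_orf = "ATGCCTACTTCTGGAACTACTATCGAGCTT"
counts = codon_counts(sample_orf)

# Share of cysteine codons: 4 occurrences in an assumed 510-codon ORF.
cysteine_percent = 100.0 * 4 / 510

print(counts.most_common(3), round(cysteine_percent, 1))
```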

An alternative version of pl-ole1, referred to
herein as pl-ole1-2, was also constructed. This
synthetic gene was modified only in specific codons
identified as high frequency splicing signals. It was
discovered that this construct is expressed as well
as pl-ole1 in Arabidopsis.



EXAMPLE 2

Vacuum Infiltration Transformation of
Arabidopsis thaliana with pl-ole1

A modification of a transformation protocol of
Pam Green (http://www.bch.msu.edu/pamgreen/vac.html) was
used for the transformation of A. thaliana with pl-ole1.
The protocol was adapted from protocols by Nicole
Bechtold and Andrew Bent. This protocol gives very good
results, with 95% of all infiltrated plants giving rise
to transformants, and a transformant in up to 1 in 25
seeds.

PROTOCOL:

1. Seeds of Arabidopsis thaliana ecotype
Columbia were sown in lightweight plastic pots prepared
in the following way: mound Arabidopsis soil mixture into
3 to 4 inch pots, saturate soil with Arabidopsis
fertilizer, add more soil so that it is rounded about 0.5
above the edge.

2. Plants were grown under conditions of 16
hours light / 8 hours dark at 20°C, fertilizing with
Arabidopsis fertilizer once a week from below, adding
about 0.5 L to each flat. After 4-6 weeks, plants were
considered ready for vacuum infiltration when the primary
inflorescence was 10-15 cm tall and the secondary
inflorescences appeared at the rosette. The bolts were
clipped back and 2 to 3 days were allowed for them to
regrow before infiltration.

3. In the meantime, the construct was
transformed into Agrobacterium tumefaciens strain
LBA4404. When plants were ready to transform, a 50 mL
culture of LB medium containing 50 mg/L kanamycin and 50
mg/L of streptomycin was inoculated with a 1 mL overnight
starter culture.

4. Cultures were grown overnight at 28°C with
shaking. The culture was pelleted, the supernatant
removed, and the pellet resuspended in 250 mL of
infiltration medium to OD600 > 0.8. Infiltration medium
(1 liter) comprised 2.2 g MS salts, 1X B5 vitamins, 50 g
sucrose, 0.5 g MES, pH to 5.7 with KOH, 0.044 µM
benzylaminopurine, 200 µL Silwet L-77 (OSi
Specialties).

5. The resuspended culture was placed in a 
magenta jar inside a large bell jar. Pots containing 
plants to be infiltrated were inverted into the solution 
so that the entire plant was covered, including rosette, 
but none of the soil was submerged. 

6. A vacuum of 400 mm Hg (about 17 inches) was
drawn. Once the vacuum level was reached, the suction was
closed and the plants allowed to remain under vacuum for
five minutes. The vacuum was then quickly released. The
pots were briefly drained, then placed on
their sides in a tray, which was covered with a humidome
to maintain humidity. The next day, the plants were
removed to the growth room, the pots uncovered and set
upright. Plants infiltrated with different constructs
were kept separated in different trays thereafter.

7. Plants were allowed to grow under the same
conditions as before. Plants were staked individually as
the bolts grew. When plants were finished flowering,
water was gradually reduced, then eliminated to allow the
plants to dry out. Seeds were harvested from each plant
individually.

8. Large selection plates were prepared: 4.3
g/L MS salts; 1X B5 vitamins (optional); 1% sucrose;
0.5 g/L MES, pH to 5.7 with KOH; 0.8% phytagar. The
medium was autoclaved, antibiotics were added (35 µg/mL
kanamycin and 250 µg/mL of carbenicillin) and 150 x 15 mm
plates were poured.

9. Plates were dried well in the sterile hood
before plating; 20-30 minutes with the lids open was
usually sufficient.

10. For each plant, up to 100 µL of seeds
(approximately 2500 seeds) was sterilized and plated out
individually. Seeds were sterilized as follows: 1 min
in 70% ethanol, 7 minutes in 50% bleach / 0.02% Triton
X-100 with vortexing, 6 rinses in sterile distilled
water. Seeds were resuspended in 2 mL sterile 0.1%
agarose and poured onto large selection plates as if
plating phage. Plates were tilted so seeds were evenly
distributed, and allowed to sit 10-15 minutes, during
which time the liquid soaked into the medium. Plates
were sealed with Parafilm and placed in a growth room.

11. After 7 to 10 days, transformants were
visible as dark green plants. These were transferred
onto "hard selection" plates (100 x 15 mm plates with
same recipe as selection plates but with 1.5% phytagar)
to eliminate any pseudo-resistants, then replaced in the
growth room.

12. After 10 to 14 days, the plants possessed
at least two sets of true leaves. At this point, plants
were transferred to soil, covered with plastic, and moved
to a growth chamber with normal conditions. They were
typically kept covered for several days.
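
Step 4 states the infiltration medium per liter but resuspends the pellet in 250 mL, so the recipe has to be scaled. The following sketch shows that arithmetic. The dictionary keys and the function scale_recipe are illustrative names only; the 1 mL of 1000X B5 vitamin stock per liter follows from the 1X final concentration, and benzylaminopurine is left out because the text gives no stock concentration for it.

```python
# Infiltration medium components per 1 liter, as listed in step 4.
INFILTRATION_MEDIUM_PER_L = {
    "MS salts (g)": 2.2,
    "sucrose (g)": 50.0,
    "MES (g)": 0.5,
    "Silwet L-77 (uL)": 200.0,
    "1000X B5 vitamins (mL)": 1.0,  # gives 1X final concentration
}


def scale_recipe(per_liter, volume_ml):
    """Scale per-liter amounts to the requested final volume in mL."""
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in per_liter.items()}


# Amounts for the 250 mL used to resuspend one pelleted culture.
batch = scale_recipe(INFILTRATION_MEDIUM_PER_L, 250)

for name, amount in batch.items():
    print(f"{name}: {amount:g}")
```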

References:

Bechtold N, Ellis J, Pelletier G (1998) Methods
Mol Biol 82: 259-266.

Bent A, Kunkel BN, Dahlbeck D, Brown KL,
Schmidt R, Giraudat J, Leung J, Staskawicz BJ (1994)
Science 265: 1856-1860.

Koncz C, Schell J (1986) Mol. Gen. Genet. 204:
383-396.

Solutions:

1000X B5 vitamins (10 mL):

1000 mg myo-inositol
100 mg thiamine-HCl
10 mg nicotinic acid
10 mg pyridoxine-HCl

Dissolve in ddH2O and store at -20°C.

Arabidopsis fertilizer (10 liters):

50 mL 1M KNO3
25 mL 1M KPO4 (pH 5.5)
20 mL 1M MgSO4
20 mL 1M Ca(NO3)2
5 mL 0.1M Fe-EDTA
10 mL micronutrients (see below)

Dissolve in ddH2O and store at room temperature.

Arabidopsis micronutrients (500 mL):

70 mL 0.5M boric acid
14 mL 0.5M MnCl2
2.5 mL 1M CuSO4
1 mL 0.5M ZnSO4
1 mL 0.1M NaMoO4
1 mL 5M NaCl
0.05 mL 0.1M CuCl2

Dissolve in ddH2O and store at room temperature.
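
The working concentrations in the finished fertilizer follow from the stock volumes by simple dilution (C1 x V1 = C2 x V2). The sketch below works this out for the five salt stocks of the 10 liter fertilizer recipe. It is a convenience calculation only; the names FERTILIZER_STOCKS and final_mM are hypothetical.

```python
# Stock volume (mL) and stock molarity (mol/L) per 10 liters of fertilizer.
FERTILIZER_STOCKS = {
    "KNO3": (50, 1.0),
    "KPO4": (25, 1.0),
    "MgSO4": (20, 1.0),
    "Ca(NO3)2": (20, 1.0),
    "Fe-EDTA": (5, 0.1),
}
FINAL_VOLUME_ML = 10_000


def final_mM(stock_ml, stock_molar, final_ml=FINAL_VOLUME_ML):
    """Final concentration in mM after diluting a stock into final_ml."""
    return 1000.0 * stock_molar * stock_ml / final_ml


working = {
    name: final_mM(volume, molarity)
    for name, (volume, molarity) in FERTILIZER_STOCKS.items()
}

for name, conc in working.items():
    print(f"{name}: {conc:g} mM")
```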



EXAMPLE 3

Customizing OLE1 to Express
Post-Translational Modifications

After determining the optimized codon
preferences of OLE1 mRNA (or mRNA derived from another
fungal or animal desaturase) for high level expression in
the host plant, specific amino acids that are involved in
the post-translational control of enzyme activity or
stability are altered to maximize the catalytic activity
of the expressed enzyme. There are a number of protein
kinase and/or phosphorylase consensus sequences that are
highly conserved in the fungal and animal desaturases.
These are shown below. First is shown a table of aligned
potential phosphorylation sites in desaturases. Next is
shown a pileup of Δ-9 fatty acid desaturases. PROSITE
analysis of these desaturases predicts a number of
potential phosphorylation sites, highlighted by bold
underlined characters.

Pileup of Δ-9 fatty acid desaturases showing potential
phosphorylation sites:

               1                                                   50
Rat                                                  MPAHM LQE.ISSSY.
Mouse                                                MPAHM LQE.ISSSY.
Sheep                                                MPAHL LQEEISSSY.
Pig                                                              SSY.
Human                                                MPAHL LQDDISSSY.
Hamster                                              MPGHL LQEEMTSSYT
Drosophila                                             MPP NAQAGAQSIS
Moth
C. elegans                                        MTVKTRSN IAKKIEKDGG
S. cerevisiae  MPTSGTTIEL IDDQFPKDDS ASSGIVDEVD LTEANILATG LNKKAPRIVN
P. angusta
H. capsulatum
M. rouxii
C. curvatus
C. merolae                                      MTAKVESKVR EEEKGSNPST



               51                                                 100
Rat            TTTTTITEPP SGNLQNGREK MKKVPLYLEE DI     RPE MREDIHDPSY
Mouse          TTTTTITAPP SG...NEREK VKTVPLHLEE DI     RPE MKEDIHDPTY
Sheep          TTTTTITAPP SRVLQNGGGK LEKTPLYLEE DI     RPE MRDDIYDPNY
Pig            TTTTTITAPS SRVLQNGGGK SEKTPQYVEE DI     RPE MKDDIYDPTY
Human          TTTTTITAPP PGVLQNGGDK LETMPLYLED DI     RPD IKDDIYDPTY
Hamster        TTTTTITEPP SESLQ      .KTVPLYLEE DI     RPE MKEDIYDPSY
Drosophila     DSLIAAASAA ADAGQSPTKL QEDSTGVLFE CD     VET TDGGLVKDIT
Moth                           MPPQG QTGGSWVLYE TD     AVN TDTD..APVI
C. elegans     PETQYLAVDP NEIIQLQEES KKWPKCLPA  RLPTAACKAS QENGECQKIV
S. cerevisiae  GFGSLMGSKE MVSVEFDKKG NEKKSNLDRL LEKDNQEKEE AKTKIH.ISE
P. angusta          MGTKS MTDVTAEEL. ..SKDSVAMM LAKDRELKNK YLKQKH.ISE
H. capsulatum          MA LNEAPTASPV AETAAGGKDV VTDAARRPNS EPKKVH.ITD
M. rouxii                        MSN IATLTSTART KTESMKPPLP KTKMPP.LFD
C. curvatus     MSASTKQAS TTVAQPSGKP VTNVIDPERD DFIVPDNYVT RTVENM.KML
C. merolae     AAADDSGAVI PTLKPRPKPA VEPLEREGVE FDPQRGLVFE KTRSSKWMSE



               101                                                150
Rat            QDEEGPPPKL EYVWRNIILM ALLHVGALYG ITL.IPSSKV YTLLWGIFYY
Mouse          QDEEGPPPKL EYVWRNIILM VLLHLGGLYG IIL.VPSCKL YTALFGIFYY
Sheep          QDKEGPKPKL EYVWRNIILM GLLHLGALYG ITL.IPTCKI YTFLWVLFYY
Pig            QDKEGPQGKL EYVWRNIILM SLLHLGALYG IIL.IPTCKI YTLLWAFAYY
Human          KDKEGPSPKV EYVWRNIILM SLLHLGALYG ITL.IPTCKF YTWLWGVFYY
Hamster        QDEEGPPPKL EYVWRNIILM ALLHLGALYG LVL.VPSSKV YTLLWAFVYY
Drosophila     VMKKAEKRLL KLVWRNIIAF GYLHLAALYG AYLMVTSAKW QTCILAYFLY
Moth           VPPSAEKREW KIVWRNVILM GMLHIGGVYG AYLFLTKAMW LTDLFAFFLY
C. elegans     FLEIVIPYKM EIVWRNVALF AALHFAAAIG LYQLIFEAKW QTVIFTFLLY
S. cerevisiae  QPWTLNNWHQ HLNWLNMVLV CGMPMIGWYF ALSGKVPLHL NVFLFSVFYY
P. angusta     QPWTWENWHR HINWLNFILV LAVPFAG..L ISTKWVPLKL HTFVTAVILY
H. capsulatum  TPITLANWHK HISWLNVTLI IAIPIYG..L VQAYWVPLHL KTALWAWYY
M. rouxii      QPVTSKNWTK FVNWPQAILL CVTPLIALYG IFT..TELTK KTLIWSWIYY
C. curvatus    PPVTWRNLHK NIQWISFLAL TIPPAMAIYG LCT..VPVQT KTFIWSWYY
C. merolae     KELNELPLLQ RINWLS.TSI IFTPLIGT.L IGIWFVPLQR KTLVLAIVTY

               151                                                200
Rat            LISALGITAG AHRLWSHRTY KARLPLRIFL IIANTMAFQN DVYEWARDHR
Mouse          MTSALGITAG AHRLWSHRTY KARLPLRIFL IIANTMAFQN DVYEWARDHR
Sheep          VISALGITAG VHRLWSHRTY KARLPLRVFL IIANTMAFQN DVFEWSRDHR
Pig            LLSAVGVTAG AHRLWSHRTY KARLPLRVFL IIANTMAFQN DVYEWARDHR
Human          FVSALGITAG AHRLWSHRSY KARLPLRLFL IIANTMAFQN DVYEWARDHR
Hamster        VISIEGIGAG VHRLWSHRTY KARLPLRIFL IIANTMAFQN DVYEWARDHR
Drosophila     VISGLGITAG AHRLWAHRSY KAKWPLRVIL VIFNTIAFQD AAYHWARDHR
Moth           LCSGLGITAG AHRLWAHKSY KARLPLRLLL TLFNTLAFQD AVIDWARDHR
C. elegans     VFGGFGITAG AHRLWSHKSY KATTPMRIFL MILNNIALQN DVIEWARDHR
S. cerevisiae  AVGGVSITAG YHRLWSHRSY SAHWPLRLFY AIFGCASVEG SAKWWGHSHR
P. angusta     CFGGISITAG YHRHWAHRAY DCKLPVKIFF ALFGASAVEG SIKMWGHQHR
H. capsulatum  FMTGLGITAG YHRLWAHCSY SATLPLKIYL AAVGGGAVEG SIRWWARGHR
M. rouxii      FITGLGITAG YHRMWSHRAY RGTDLLRWFM SFAGAGAVEG SIYWWSRGHR
C. curvatus    FITGLGITAG YHRLWAHRSY NASKPLQYFL ALCGAGSVQG SIRWWSRGHR
C. merolae     FCCGLGITGG YHRLWSHRSY EAHWLVQVIL ACFGAAAFEG SARYWCRLHR



               201                                                250
Rat            AHHKFSETHA DPHNSRRGFF FSHVGWLLVR KHPAVKEKGG KLDMSDLKAE
Mouse          AHHKFSETHA DPHNSRRGFF FSHVGWLLVR KHPAVKEKGG KLDMSDLKAE
Sheep          AHHKFSETDA DPHNSRRGFF FSHVGWLLVR KHPAVREKGA TLDLSDLRAE
Pig            AHHKFSETDA DPHNSRRGFF FSHVGWLLVR KHPAVKEKGG LLNMSDLKAE
Human          AHHKFSETHA DPHNSRRGFF FSHVGWLLVR KHPAVKEKGS TLDLSDLEAE
Hamster        AHHKFSETYA DPHNSRRGFF FSHVGWLLVR KHPAVKEKGG KLDMSDLKAE
Drosophila     VHHKYSETDA DPHNATRGFF FSHVGWLLCK KHPEVKAKGK GVDLSDLRAD
Moth           MHHKYSETDA DPHNATRGFF FSHVGWLLVR KHPQIKAKGH TIDLSDLKSD
C. elegans     CHHKWTDTDA DPHNTTRGFF FAHMGWLLVR KHPQVKEQGA KLDMSDLLSD
S. cerevisiae  IHHRYTDTLR DPYDARRGLW YSHMGWMLLK PNP...KYKA RADITDMTDD
P. angusta     VHHRYTDTPR DPYDAKRGFW YSHMGWMLLV PNP...RYKA RADISDLLDD
H. capsulatum  AHHRYTDTDK DPYSVRKGLL YSHIGWMVMK QNP...KRIG RTEITDLNED
M. rouxii      AHHRWTDTDK DPYSAHRGFF FSHFGWMLVQ RPK...NRIG YADVADLKAD
C. curvatus    AHHRYTDTKL DPYSAHEGFW HAHMGWMLI. KPR...GKIG VADISDLSKN
C. merolae     AHHRYVDSDR DPYAVEKGFW YAHLWWMVFK LPR...QRQG RVDITDLNAN



               251                                                300
Rat            KLVMFQRRYY KPGLLLMCFI LPTLVPWYCW GETFLHSLFV STFLRYTLVL
Mouse          KLVMFQRRYY KPGLLLMCFI LPTLVPWYCW GETFVNSLFV STFLRYTLVL
Sheep          KLVMFQRRYY KPGVLLLCFI LPTLVPWYLW GESFQNSLFF ATFLRYAWL
Pig            KLVMFQRRYY KPGILLMCFI LPTIVPWYCW GEAFPQSLFV ATFLRYAIVL
Human          KLVMFQRRYY KPGLLMMCFI LPTLVPWYFW GETFQNSVFV ATFLRYAWL
Hamster        KLVMFQRRYY KPAILLMCFI LPTFVPWYFW GEAFVNSLCV STFLRYTLVL
Drosophila     PILMFQKKYY MILMPIACFI IPTWPMYAW  GESFMNAWFV ATMFRWCFIL
Moth           PILRFQKKYY LTLMPLICFI LPSYIPT.LW GESAFNAFFV CSIFRYVYVL
C. elegans     PVLVFQRKHY FPLVILCCFI LPTIIPVYFW KETAFIAFYT AGTFRYCFTL
S. cerevisiae  WTIRFQHRHY ILLMLLTAFV IPTLICGYFF ND.YMGGLIY AGFIRVFVIQ
P. angusta     WWRVQHRHY  LLLMWMAFL  FPAVLTHYLF ND.FWGGFIY AGLLRAWIQ
H. capsulatum  PVWWQHRNY  LKWIFMGIV  FPMLVSGLGW GD.WFGGFIY AGILRIFFVQ
M. rouxii      HWAFQHKYY  PYFALGMGFI FPTLVAGLGW GD.FRGGYFY AGVLRLCFVH
C. curvatus    PWKWQHNNY  VALLFFMGLA FPTLVAGLGW GD.WWGGLFF AGAARLVFVH
C. merolae     PILRFQHRYY LQIAILFSFV IPLTISTLGW GD.FWGGLVY ACLGRMLFVQ






               301                                                350
Rat            NATWLVNSAA HLYGYRPYDK NIQSRENILV SLGSVGEGFH NYHHAFPYDY
Mouse          NATWLVNSAA HLYGYRPYDK NIQSRENILV SLGAVGEGFH NYHHTFPFDY
Sheep          NATWLVNSAA HMYGYRPYDK TINPRENILV SLGAVGEGFH NYHHTFPYDY
Pig            NATWLVNSAA HLYGYRPYDK TISPRENILV SLGAVGEGFH NYHHTFPYDY
Human          NATWLVNSAA HLFGYRPYDK NISPRENILV SLGAVGEGFH NYHHSFPYDY
Hamster        NATWLVNSAA HLYGYRPYDK NIDPRENALV SLGCLGEGFH NYHHAFPYDY
Drosophila     NVTWLVNSAA HKFGGRPYDK FINPSENISV AILAFGEGWH NYHHVFPWDY
Moth           NVTWLVNSAA HLWGSKPYDK NINPVETRPV SLWLGEGFH  NYHHTFPWDY
C. elegans     HATWCINSAA HYFGWKPYDS SITPVENVFT TIAAVGEGGH NFHHTFPQDY
S. cerevisiae  QATFCINSLA HYIGTQPFDD RRTPRDNWIT AIVTFGEGYH NFHHEFPTDY
P. angusta     QATFCVNSLA HWIGEQPFDD RRTPRDHVLT ALVTFGEGYH NFHHEFPSDY
H. capsulatum  QATFCVNSLA HWLGDQPFDD RNSPRDHIVT ALVTLGEGYH NFHHEFPSDY
M. rouxii      HATFCVNSLA HYLGESTFDD HNTPRDSWVT ALVTMGEGYH NFHHQFPQDY
C. curvatus    HSTFCVNSLA HWLGETPFDN KHTPKDHFIT ALVTVGEGYH NFHHQFPMDF
C. merolae     QSTFCVNSLA HWWGEQTFSR RHTSYDSVIT ALVTLGEGYH NFHHEFPHDY



               351                                                400
Rat            SASEY.RWHI NFTTFFIDCM AALGLAYDRK KVSKAAVLAR IKRTGDGSHK
Mouse          SASEY.RWHI NFTTFFIDCM AALGLAYDRK KVSKATVLAR IKRTGDGSHK
Sheep          SASEY.RWHI NFTTFFIDCM AAIGLAYDRK KVSKAAVLGR MKRTGEESYK
Pig            SASEY.RWHI NLTTFFIDCM AALGLAYDRK KVSKAAIL
Human          SASEY.RWHI NFNTFFIDWM AALGLTYDRK KVSKAAILAR IKRTGDGNYK
Hamster        SASEY.RWHI NFTTFFIDCM AALGLAYDRK KVSKAAVLAR IKRTGDGSCK
Drosophila     KTAEFGKYSL NFTTAFIDFF AKIGWAYDLK TVSTDIIKKR VKRTGDGTHA
Moth           KTAELGDYSL NFTKMFIDFM ASIGWAYDLK TVSTDVIQKR VKRTGDGSHA
C. elegans     RTSEYS.LKY NWTRVLIDTA AALGLVYDRK TACDEIIGRQ VSNHGCDIQR
S. cerevisiae  RNA.IKWYQY DPTKVIIYLT SLVGLAYDLK KFSQNAIEEA LIQQEQKKIN
P. angusta     RNA.LKWYQY DPTKWIYLL  SKVGLAYNLK KFSQNAIDQG ILQQQQKKLD
H. capsulatum  RNA.IEWHQY DPTKWTIWIW KQLGLAYDLK QFRANEIEKG RVQQLQKKID
M. rouxii      RNA.IKFGQY DPTKWKIIVL SWFGLAYELK QFPTNEVTKG RLFMEEKRIQ
C. curvatus    RNA.IKWYQY DPTKWFIWTM AQLGLASHLK KFPDNEIKKG QYTMKLMQLQ
C. merolae     RNG.WWYHW  DPTKWVIRLL SWAGLAWHLV RFPRNELVKA RLQVRQEILD



               401                                                450
Rat            SS
Mouse          SS
Sheep          SG
Pig
Human          SG
Hamster        SG
Drosophila     TWGWGDVDQP KEEIE.DAVI THKKSE
Moth           VWGWDDHEVH QEDKKLAAII NPEKTE
C. elegans     GKSIM
S. cerevisiae  KKKAKINWGP VLTDLPMWDK QTFLAKS.KE NKGLVIISGI VHDVSGYISE
P. angusta     RMRAKLNWGP QLSELPVWDK STFFEKA.KE QKGLVIISGI VHDCANFLTE
H. capsulatum  QRRAKLDWGI PLEQLPVIEW DDYVDQA.KN GRGLIAIAGV VHDVTDFIKD
M. rouxii      AQKAKLSYGT PLKDLPIYTW EEYQSLVLND NKKWVLIEGV LYDVEEFMKE
C. curvatus    EQSEKLEWPK HSNDLPVISW EDFQA..ESK TRALIAVHGF IHDCSSFIED
C. merolae     EAKKRVDWGK PIESLPVWTW KDVQRLAKEE NRLLWIEGI  VHDCTRFKVQ

               451                                                500
Rat
Mouse
Sheep
Pig
Human
Hamster
Drosophila
Moth
C. elegans
S. cerevisiae  HPGGETLIKT ALGKDATKAF SGGVYRHSNA AQNVLADMRV AVIKESKNSA
P. angusta     HPGGQALLKT SFGKDATMAF NGGVYAHSNA AHNLLATMRV AVIRDGGANG
H. capsulatum  HPGGKAMINS GIGKDATAMF NGGVYNHSNA AHNQLSTMRV GVIRGGCEVE
M. rouxii      HPGGMKYLST AVGKDMTTAF NGGIYNHSNG TRNLLTSLRV GVLRNGMQV.
C. curvatus    HPGGAHLIKR AIGTDSTTAF FGGVYDHSNA AHNLLAMMRV GVLDGGMEVE
C. merolae     HPGGQRILEF WNVRDATQAF NGDVYNHTKA ARNLLAHLRV AQLKEIYEPE



Protein kinase (specifically cAMP- and
cGMP-dependent) phosphorylation sites. There have been a
number of studies relative to the specificity of cAMP- and
cGMP-dependent protein kinases (Feramisco J.R. et al., J.
Biol. Chem. 255:4240-4245, 1980; Glass D.B., Smith S.B., J.
Biol. Chem. 258:14797-14803, 1983; Glass D.B. et al., J.
Biol. Chem. 261:2987-2993, 1986). Both types of kinases
appear to share a preference for the phosphorylation of
serine or threonine residues found close to at least two
consecutive N-terminal basic residues. It is important to
note that there are quite a number of exceptions to this
rule. However, the consensus pattern is as follows:
[RK](2)-x-[ST], where S or T is the phosphorylation site.

Protein kinase C phosphorylation site. In vivo,
protein kinase C exhibits a preference for the
phosphorylation of serine or threonine residues found close
to a C-terminal basic residue (Woodgett J.R. et al., Eur. J.
Biochem. 161:177-184, 1986; Kishimoto A. et al., J. Biol.
Chem. 260:12492-12499, 1985). The presence of additional
basic residues at the N- or C-terminus of the
target amino acid enhances the Vmax and Km of the
phosphorylation reaction. The consensus pattern is:
[ST]-x-[RK], where S or T is the phosphorylation site.

Casein kinase II phosphorylation site. Casein
kinase II (CK-2) is a protein serine/threonine kinase whose
activity is independent of cyclic nucleotides and calcium.
CK-2 phosphorylates many different proteins. The substrate
specificity (Pinna L.A., Biochim. Biophys. Acta
1054:267-284, 1990) of this enzyme can be summarized as
follows: (1) under comparable conditions Ser is favored
over Thr; (2) an acidic residue (either Asp or Glu) must be
present three residues from the C-terminal end of the
phosphate acceptor site; (3) additional acidic residues in
positions +1, +2, +4, and +5 increase the phosphorylation
rate (most physiological substrates have at least one
acidic residue in these positions); (4) Asp is preferred to
Glu as the provider of acidic determinants; and (5) a basic
residue at the N-terminus of the acceptor site decreases
the phosphorylation rate, while an acidic one will increase
it. The consensus pattern is: [ST]-x(2)-[DE], where S or T
is the phosphorylation site (note: this pattern is found in
most of the known physiological substrates).
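
The three PROSITE-style consensus patterns quoted above translate directly into regular expressions. The sketch below is one hedged way to scan a protein for them and is not the PROSITE software itself: the dictionary PATTERNS and the function find_sites are hypothetical names, and the test sequence is the first 50 residues of the S. cerevisiae desaturase taken from the pileup.

```python
import re

# name: (regular expression, offset of the phosphorylated S/T in the match)
PATTERNS = {
    "cAMP/cGMP-dependent kinase": (r"[RK]{2}.[ST]", 3),  # [RK](2)-x-[ST]
    "protein kinase C": (r"[ST].[RK]", 0),               # [ST]-x-[RK]
    "casein kinase II": (r"[ST].{2}[DE]", 0),            # [ST]-x(2)-[DE]
}


def find_sites(protein, regex, offset):
    """Return 1-based positions of the phospho-acceptor for every match.

    A lookahead is used so that overlapping matches are all reported.
    """
    lookahead = re.compile("(?=(" + regex + "))")
    return [m.start() + offset + 1 for m in lookahead.finditer(protein.upper())]


# Residues 1-50 of S. cerevisiae OLE1, from the pileup.
ole1_n_terminus = "MPTSGTTIELIDDQFPKDDSASSGIVDEVDLTEANILATGLNKKAPRIVN"

for name, (regex, offset) in PATTERNS.items():
    print(name, find_sites(ole1_n_terminus, regex, offset))
```

A match only marks a candidate site; as the text notes, there are many exceptions to these rules, so hits still need experimental confirmation.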

If phosphorylation of a specific site by any
kinase is found to increase the catalytic activity or
stability of the encoded desaturase protein, the
phosphorylated serine or threonine residue is changed to
encode a negatively charged amino acid (aspartic acid or
glutamic acid) in order to permanently optimize the
activity of the protein. If phosphorylation of a specific
residue is found to decrease the activity or stability of
the encoded desaturase, the affected serine or threonine
encoding codon is altered to substitute a neutral or a
positively charged amino acid that will permanently
optimize the activity or stability of the protein.
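
The two substitution strategies just described can be sketched at the protein level as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the function names are hypothetical, and alanine is used as the default "neutral" replacement only as an example, since the text does not name a specific residue.

```python
def substitute(protein, position, new_residue):
    """Replace the serine or threonine at a 1-based position."""
    old = protein[position - 1]
    if old not in ("S", "T"):
        raise ValueError(f"residue {position} is {old}, not S or T")
    return protein[:position - 1] + new_residue + protein[position:]


def phosphomimetic(protein, position, acidic="D"):
    """Mimic permanent phosphorylation with Asp (D) or Glu (E)."""
    if acidic not in ("D", "E"):
        raise ValueError("acidic residue must be D or E")
    return substitute(protein, position, acidic)


def non_phosphorylatable(protein, position, replacement="A"):
    """Block phosphorylation with a neutral or basic residue.

    Alanine is only an example default; it is an assumption here.
    """
    return substitute(protein, position, replacement)


peptide = "MPTSGTTIEL"  # residues 1-10 of S. cerevisiae OLE1 (pileup)
print(phosphomimetic(peptide, 6))        # threonine 6 replaced by Asp
print(non_phosphorylatable(peptide, 6))  # threonine 6 replaced by Ala
```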
EXAMPLE 4

Further Modifications and Improvements of the
Saccharomyces cerevisiae OLE1 Gene for Plant Expression
Using Elements Derived from Native Plant Desaturase Genes

The activity of the native or modified forms of
the Saccharomyces cerevisiae OLE1 Δ-9 desaturase gene in
plant tissues may be further improved by the substitution
or inclusion of elements derived from native plant
desaturase genes. Favorable plant gene elements may
include sequences that improve the expression of the
modified gene at one or more levels, including the
following: 1) transcription, 2) pre-mRNA processing, 3)
mRNA transport from the nucleus to the cytoplasm, 4) mRNA
stability, 5) translation, 6) targeting or retention of the
protein at the appropriate membrane surface or organelle
surface, 7) protein folding and maturation, and 8)
stability of the functional desaturase protein.

The inventors have shown that the OLE1 gene can
tolerate significant modifications without losing its
biological activity. These modifications include deletion
of the "coiled coil" region, the addition of 239 amino
acids to the N-terminus of Ole1p and truncation of 55 and
60 amino acids from the N-terminal end of the protein. The
inventors have also shown that modifications of the 5' and
3' untranslated regions of the OLE1 mRNA can significantly
affect its stability. For example, removing a short open
reading frame near the 5' "cap" region of the OLE1 mRNA
increases its half-life in Saccharomyces from 12 minutes to
approximately 25 minutes. The existence of elements in the
mRNA that affect its stability indicates that other elements
might also exist that affect the stability of an mRNA
generated by a synthetic gene in another host organism.
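
The practical effect of the half-life change can be estimated if one assumes simple first-order decay, which the text does not state explicitly. Under that assumption the fraction of transcript remaining after time t is 0.5 raised to the power t divided by the half-life. The function name below is hypothetical.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction of an mRNA pool left after first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (minutes / half_life_min)


# Compare the two OLE1 mRNA half-lives quoted in the text.
for label, half_life in (("native 5' region", 12), ("short ORF removed", 25)):
    left = fraction_remaining(25, half_life)
    print(f"{label}: {left:.2f} of the mRNA left after 25 min")
```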

Plant desaturase gene elements that enhance the
function of the modified Δ-9 desaturase gene are identified
by a 2-step method. STEP 1 involves isolating a series of
DNA sequences from a cDNA that encodes a plant ER lipid
biosynthetic enzyme. Those elements are linked, or
inserted into regions of a native or "optimized" gene under
control of a yeast promoter in a vector suitable for
expression in Saccharomyces cerevisiae. The resulting
vectors are then tested for their ability to produce
functional desaturase enzymes in strains of Saccharomyces
that contain an inactive form of the Δ-9 fatty acid
desaturase gene.

In STEP 2, plant desaturase sequences from the
above vectors that are found to produce a functional Δ-9
desaturase gene are used to isolate homologous sequences
from plant genomic DNA. The isolated genomic sequences are
used to construct a synthetic gene that produces an mRNA
that encodes the same functional desaturase protein
produced by the vector in STEP 1. In this instance, the
genomic sequences encompass the same protein coding
elements as those encoded by the homologous cDNA sequence
and also include genomic elements that encode the 5' and/or
3' untranslated regions of the plant desaturase mRNA.
These combined genomic elements should differ from the
cDNA-derived sequences used in STEP 1 by containing authentic
plant introns (which may facilitate efficient and correct
splicing of the chimeric mRNA in the plant nucleus) and
signals that affect the mRNA stability, mRNA transport, and
efficient translation of the mRNA in plant tissues. The
chimeric plant/synthetic gene containing the genomic
sequences is inserted into vectors under the control of
plant seed-specific promoters and tested for expression and
desaturase function in plants, including Brassica,
Arabidopsis, maize and soybeans.

The following specific examples further
illustrate these methods employing the Arabidopsis FAD2
gene, which encodes an ER Δ12-desaturase, as a source of
plant desaturase DNA sequences. In the preferred
embodiment, the source of the plant desaturase DNA would be
the FAD2 homolog, or a related ER lipid biosynthetic gene,
that is derived from the same plant species that is
intended to be modified by the resulting vector for
commercial use.

A. Substitution of the N-terminal OLE1 protein
coding sequences with N-terminal sequences
derived from the Arabidopsis FAD2 gene.

1) A cDNA containing the FAD2 Δ-12 desaturase
mRNA coding sequence is isolated by reverse
transcriptase-polymerase chain reaction (RT-PCR) of isolated
mRNAs derived from Arabidopsis tissue or by direct DNA synthesis
using the protein and DNA sequences set forth in SEQ ID
NO:4 and SEQ ID NO:5 (open reading frame starts at +93).

2) The inventors have shown that substitution of
transmembrane sequences of the OLE1 gene with transmembrane
sequences from the Saccharomyces FAH2 gene abolishes Δ-9
desaturase activity. FAH2 encodes a sphingolipid fatty
acid hydroxylase, which is an ER membrane protein.
TMPredict analysis of the Arabidopsis FAD2 sequence
indicates that the first transmembrane region of its
encoded protein begins at residue +52, and a similar
analysis of the OLE1 sequence indicates that its first
transmembrane sequence begins at residue +113. Because the
inclusion of potential membrane spanning elements from the
plant desaturase could produce significant changes in the
desaturase core enzyme structure that affect activity, only
sequences encoding residues +1 to +52 of FAD2 are tested
for functional linkages or substitutions in the 113-residue
N-terminal region of OLE1.
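
As a rough illustration of how a hydrophobic, potentially membrane-spanning region may be located, the following Python sketch scans the 80 N-terminal FAD2 residues given in SEQ ID NO:4 with a sliding-window Kyte-Doolittle hydropathy average. This is a simpler, swapped-in technique and not the TMPredict analysis referred to above; the 19-residue window and the function names are assumptions made for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues 1-80 of Arabidopsis FAD2 as listed in SEQ ID NO:4 (one-letter code).
FAD2_NTERM = (
    "MGAGGRMPVPTSSKKS"
    "ETDTTKRVPCEKPPFS"
    "VGDLKKAIPPHCFKRS"
    "IPRSFSYLISDIIIAS"
    "CFYYVATNYFSLLPQP"
)


def window_scores(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophobic_window(seq, window=19):
    """Return (1-based start residue, mean hydropathy) of the top-scoring window."""
    scores = window_scores(seq, window)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1, scores[best]


if __name__ == "__main__":
    start, score = most_hydrophobic_window(FAD2_NTERM)
    print(f"most hydrophobic window starts at residue {start} (mean {score:.2f})")
```

On the residues listed, the top-scoring window falls in the low 50s, which is broadly in line with the +52 boundary stated above, although a hydropathy average is only a coarse indicator of a transmembrane segment.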

A series of PCR oligonucleotide primers is
synthesized that includes a 5' primer that complements
sequences including the +1 start codon of the FAD2 gene and
3' primers that complement sequences ending, for example,
at residues +20, +35 and +52 of the FAD2 gene. These are
used to amplify a series of fragments of different lengths
from the FAD2 cDNA that extend from the +1 codon through
codon +52. A second PCR amplification is performed using a
5' primer that is complementary to sequences that include
the 5' end of the FAD2 mRNA and the 3' primer that includes
codon +52. That amplification is done using Arabidopsis
genomic DNA as a template. The amplified fragment from
that reaction is cloned into a bacterial vector and
subjected to DNA sequencing to detect the presence of
introns within the genomic sequence. The cloned genomic
fragment is also used to construct vectors for plant
expression as indicated in STEP 2 of the method.
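
The nucleotide coordinates of the cDNA fragments described above follow directly from the stated open reading frame start (+93 in SEQ ID NO:5). The following Python sketch is offered only as an aid to the reader; the function names are hypothetical, and actual primer design would also consider length, melting temperature and added cloning sites, which are not modelled here.

```python
ORF_START = 93  # 1-based position of the ATG in SEQ ID NO:5, as stated in the text


def codon_span(codon_number, orf_start=ORF_START):
    """Return the 1-based (first, last) nucleotide positions of a codon."""
    first = orf_start + 3 * (codon_number - 1)
    return first, first + 2


def fragment_span(last_codon, orf_start=ORF_START):
    """Return the 1-based span of a fragment running from codon +1 through last_codon."""
    return orf_start, codon_span(last_codon, orf_start)[1]


def reverse_complement(dna):
    """Reverse complement of a DNA string, as needed for a 3' (reverse) primer."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(dna.lower()))


if __name__ == "__main__":
    for last in (20, 35, 52):
        print(f"codons +1 to +{last}: nucleotides {fragment_span(last)}")
```

A 3' primer for a given fragment would be taken as the reverse complement of the nucleotides at the end of that span.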

The amplified cDNA fragments are inserted into
yeast expression vectors that contain the native OLE1 mRNA
coding sequence under the control of the Saccharomyces
galactose-inducible GAL1 promoter. Insertion of the plant
DNA fragments can be done in several ways: 1) a fragment is
inserted upstream of the OLE1 protein coding sequences so
that its protein coding element is fused in frame to the +1
codon of the OLE1 encoded protein; 2) the codons on the
plant fragment replace the equivalent OLE1 residues
starting from the +1 ATG codon (e.g., a plant DNA fragment
containing codons +1 -> +52 replaces OLE1 codons +1 ->
+52); and 3) the full length fragment containing codons
+1 -> +52 of the plant gene is fused in frame to codon +114
of the OLE1 gene, replacing the OLE1 residues +1 -> +113
with plant desaturase residues +1 -> +52.
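
The three insertion strategies can be pictured at the protein level with a short Python sketch. The sequences below are placeholder strings of the appropriate lengths (52 plant residues; 510 Ole1p residues per SEQ ID NO:2), not real sequences, and the mode names are hypothetical labels chosen for this example.

```python
OLE1_NTERM_LENGTH = 113  # OLE1 residues +1 to +113 precede the first transmembrane region


def build_chimera(plant_nterm, ole1, mode):
    """Assemble a chimeric protein sequence for one of the three strategies."""
    if mode == "prepend":
        # 1) plant fragment fused in frame upstream of the +1 codon of OLE1
        return plant_nterm + ole1
    if mode == "replace_equivalent":
        # 2) plant residues replace the same number of OLE1 residues from +1
        return plant_nterm + ole1[len(plant_nterm):]
    if mode == "replace_nterm":
        # 3) plant fragment fused to OLE1 codon +114, replacing OLE1 +1 to +113
        return plant_nterm + ole1[OLE1_NTERM_LENGTH:]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    plant = "p" * 52    # placeholder for FAD2 residues +1 to +52
    ole1 = "o" * 510    # placeholder for the 510-residue Ole1p
    for mode in ("prepend", "replace_equivalent", "replace_nterm"):
        print(mode, len(build_chimera(plant, ole1, mode)))
```

With these placeholders, the three products differ only in how much of the OLE1 N-terminal region is retained.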

The resulting plasmids are transformed into a
haploid ole1Δ::LEU2 strain of Saccharomyces. That strain
contains a null, disrupted form of the OLE1 gene and
therefore has a growth requirement for unsaturated fatty
acids. The transformed Saccharomyces strains are grown on
fatty acid-depleted galactose medium to test for the
ability of the induced chimeric gene to support growth of
the strain without fatty acids. Transformed strains that
grow on the fatty acid-deficient medium are further
analyzed to assess the effects of the plant sequences on
desaturase function. This is done by Western blot
analysis, to measure levels of the resulting desaturase
protein, and by fatty acid analysis of total cellular
lipids, to assess the relative activity of the desaturase
enzyme by comparison of the ratio of saturated to
unsaturated fatty acids.
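
The ratio of saturated to unsaturated fatty acids referred to above may be computed from a fatty acid composition as in the following Python sketch. The composition values are hypothetical numbers chosen only to exercise the function and are not experimental results; the species considered (16:0, 18:0, 16:1, 18:1) are the substrates and products of the Δ-9 desaturase named in the Background.

```python
SATURATED = ("16:0", "18:0")      # palmitic and stearic acids
UNSATURATED = ("16:1", "18:1")    # palmitoleic and oleic acids


def saturated_to_unsaturated_ratio(composition):
    """Ratio of summed saturated to summed unsaturated species.

    `composition` maps a fatty acid label such as "16:0" to its amount
    (for example, percent of total fatty acids).
    """
    saturated = sum(composition.get(name, 0.0) for name in SATURATED)
    unsaturated = sum(composition.get(name, 0.0) for name in UNSATURATED)
    if unsaturated == 0:
        raise ValueError("no unsaturated fatty acids detected")
    return saturated / unsaturated


if __name__ == "__main__":
    # Hypothetical composition, for illustration only.
    example = {"16:0": 20.0, "18:0": 5.0, "16:1": 45.0, "18:1": 30.0}
    print(round(saturated_to_unsaturated_ratio(example), 3))
```

A lower ratio for a strain carrying a chimeric gene, relative to a control, would point to higher relative desaturase activity.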

3) Using information derived from the above
tests, a chimeric desaturase gene is constructed using the
amplified genomic DNA from the FAD2 gene. Construction,
testing, and analysis of these vectors is guided by the
principle that the most desirable vector is one that
maximizes the use of the plant gene sequences and minimizes
the use of the Saccharomyces Δ-9 desaturase gene sequences
while retaining optimal desaturase function. Plant DNA
fragments derived from the genomic DNA amplification that
extend from the 5' end of the mRNA sequence to the longest
sequence that produces optimal desaturase function in yeast
are inserted into a vector containing the native Δ-9
desaturase gene (or one of its modified forms produced by
the methods described above). The fragment is inserted
into the vector so that the 3' end of its protein coding
sequence produces an mRNA that generates a protein sequence
identical to its counterpart derived from the FAD2 cDNA
sequences. The resulting chimeric desaturase gene, which
now encodes an mRNA that includes the FAD2 5' untranslated
region in addition to the modified protein coding
sequences, is placed into a plant expression vector under
the control of a suitable plant promoter and plant
termination/polyadenylation sequences.

4) The resulting vectors containing the
plant/yeast chimeric desaturase sequences are transformed
into plants for testing and analysis of desaturase
function. Suitable test plants include Arabidopsis
thaliana and Brassica napus. A method for transformation
and analysis of desaturase gene expression in Arabidopsis
is provided above. A method for transformation and
analysis of yeast desaturase expression in Brassica napus
is described in U.S. Patent No. 5,777,201 to Poutre et al.
(incorporated by reference herein).

B. Insertion or substitution of Arabidopsis FAD2
C-terminal protein coding sequences and 3' mRNA
untranslated region sequences into native and modified
forms of the OLE1 gene.

The inventors have previously shown that proteins
encoded by the Saccharomyces ELO2 and ELO3 genes contain a
series of charged residues in their C-terminal region.
These proteins are located on the ER surface and function
in the biosynthesis of very long chain fatty acids as
described in Oh et al. (J. Biol. Chem. 272: 17376-
17384, 1997) (incorporated by reference herein). They
further showed that deletion of the region containing the
charged residues causes the proteins to be mislocalized
from their normal cellular locations in the endoplasmic
reticulum, resulting in reduced function. Similar clusters
of charged residues, apparently associated with ER
retention or localization, occur in the C-terminal region
encoded by the OLE1 gene. These residues do not appear to
be a part of the functional cytochrome b5 domain. A
detailed comparison of the C-terminal OLE1 and Arabidopsis
FAD2 sequences shows that the plant desaturase has similar,
but not identical, clusters of charged residues to those in
the OLE1 gene. These sequences are shown below:

SEQ ID NOS: 6 and 7:

Comparison of the charged carboxyl terminal amino acids of
Ole1p (SEQ ID NO: 7) and the Arabidopsis Fad2p desaturase
(SEQ ID NO: 6). (The region of the OLE1 gene shown does not
appear to be a functional part of its cytochrome b5
domain.)

                         +- + -    - -+-  -++        +
A. thaliana FAD2   WYVAMYREAK ECIYVEPDRE GDKKGVYWYN NKL*

                    +    +- +     +   ++  -  -  +
S. cerevisiae OLE1 MRVAVIKESK NSAIRMASKR GEIYETGKFF *
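
The charge annotation in the comparison above can be regenerated from the two C-terminal sequences with the following Python sketch. It assumes that lysine and arginine are counted as positive and aspartate and glutamate as negative, with histidine ignored; the function names are hypothetical.

```python
POSITIVE = set("KR")   # assumed positively charged residues
NEGATIVE = set("DE")   # assumed negatively charged residues

# C-terminal residues shown in the comparison (stop symbol omitted).
FAD2_CTERM = "WYVAMYREAKECIYVEPDREGDKKGVYWYNNKL"
OLE1_CTERM = "MRVAVIKESKNSAIRMASKRGEIYETGKFF"


def charge_track(seq):
    """Return a string with '+', '-' or ' ' under each residue."""
    return "".join(
        "+" if aa in POSITIVE else "-" if aa in NEGATIVE else " "
        for aa in seq
    )


def net_charge(seq):
    """Count of positive residues minus count of negative residues."""
    track = charge_track(seq)
    return track.count("+") - track.count("-")


if __name__ == "__main__":
    for name, seq in (("FAD2", FAD2_CTERM), ("OLE1", OLE1_CTERM)):
        print(name)
        print(charge_track(seq))
        print(seq)
        print("net charge:", net_charge(seq))
```

Printing the track directly above each sequence gives a quick visual check of where the charged clusters fall in each protein.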

Methods similar to those shown in Section A can be used to
identify Arabidopsis FAD2 sequences that can replace the
OLE1 C-terminal sequences to optimize gene expression,
membrane targeting and ER retention of the chimeric enzyme.

1) A series of oligonucleotide primers for PCR
amplification is synthesized for isolation of elements in
the C-terminal region of the FAD2 gene. A FAD2 DNA
fragment encompassing that region is generated by PCR
amplification of the cDNA clone. Alternatively, given the
smaller size of the fragment, it or modified forms of the
plant fragment may be generated directly by DNA synthesis.
A fragment containing that region and its flanking 3'
untranslated region also is generated by PCR amplification
of Arabidopsis genomic DNA as described above. That
fragment is cloned into an appropriate vector and
sequenced, as also described.

2) Vectors are constructed that contain the plant
DNA fragments linked to or substituted into the OLE1
C-terminal coding region as described in Section A. In this
instance, the plant DNA fragments are linked in frame to
the carboxyl terminal residues of the OLE1 protein coding
region.

3) The resulting vectors are transformed into the
Saccharomyces ole1Δ strain and tested for desaturase
function as described in Section A.

4) Using information derived from the above
tests, chimeric desaturase genes containing the C-terminal
plant sequences that produce functional desaturases are
constructed using the amplified genomic DNA from the FAD2
gene, according to the principles outlined in Section A.
The resulting sequences are employed to construct vectors
that will express the chimeric plant/yeast gene under
control of plant promoter and plant termination/
polyadenylation sequences. Those vectors are transformed
into plants for testing and analysis of desaturase function
as described above.



The present invention is not limited to the
embodiments described and exemplified above, but is capable
of variation and modification without departure from the
scope of the appended claims.

We claim:

1. A synthetic fatty acid desaturase gene for
expression in a multicellular plant, the gene comprising a
desaturase domain and a cyt b5 domain, wherein the gene is
customized for expression in a plant cytoplasm.

2. The synthetic gene of claim 1, customized
from a naturally occurring gene encoding a cytosolic Δ-9
desaturase.

3. The synthetic gene of claim 2, customized
from a naturally occurring gene from Saccharomyces
cerevisiae.

4. The synthetic gene of claim 3, customized
from a naturally occurring gene from Saccharomyces
cerevisiae that encodes SEQ ID NO:2.

5. The synthetic gene of claim 4, customized
from a naturally occurring gene from Saccharomyces
cerevisiae comprising SEQ ID NO:1.

6. The synthetic gene of claim 3, comprising SEQ
ID NO:3.

7. The synthetic gene of claim 1, which further
comprises an expression regulatory sequence from a plant
gene encoding an ER biosynthetic pathway enzyme.

8. The synthetic gene of claim 1, customized for
expression in a monocotyledonous plant.

9. The synthetic gene of claim 1, customized for
expression in a dicotyledonous plant.

10. The synthetic gene of claim 1, customized
for expression in a plant genus selected from the group
consisting of Arabidopsis, Brassica, Phaseolus, Oryza,
Olea, Elaeis (Oil Palm) and Zea.

11. The synthetic gene of claim 1, customized
from a naturally occurring gene comprising both a
desaturase domain and a cyt b5 domain.

12. The synthetic gene of claim 1, wherein the
gene is a chimeric gene comprising a desaturase domain and
a heterologous cyt b5 domain.

13. The synthetic gene of claim 1, customized
from a naturally occurring gene such that the synthetic
gene and the naturally occurring gene encode an identical
amino acid sequence.

14. The synthetic gene of claim 13, wherein the
synthetic gene and the naturally occurring gene encode SEQ
ID NO:2.

15. The synthetic gene of claim 1, customized
from a naturally occurring gene such that the synthetic
gene and the naturally occurring gene encode a similar
amino acid sequence.

16. The synthetic gene of claim 1, customized
from a naturally occurring gene such that the synthetic
gene and the naturally occurring gene encode a similar
amino acid sequence, and the synthetic gene possesses
improved stability or catalytic activity as compared with
the naturally occurring gene.

17. A method for constructing a customized
bifunctional desaturase/cyt b5 encoding gene for expression
in the cytosol of a multicellular plant, comprising the
steps of:

(a) providing a DNA molecule comprising a
desaturase-encoding moiety operably linked to a cyt b5-
encoding moiety, said DNA molecule producing the
bifunctional polypeptide in a non-customized form;

(b) back-translating the polypeptide
sequence using preferred codons for expression in a
multicellular plant, thereby producing a back-translated
nucleotide sequence;

(c) analyzing the back-translated nucleotide
sequence for features that could diminish or prevent
expression in the plant cytoplasm;

(d) modifying the analyzed sequence to
correct or remove the features that could diminish or
prevent expression in the plant cytoplasm; and

(e) optionally, introducing predetermined
cloning features into the sequence in a manner that does
not materially affect the codon usage or final polypeptide
sequence, thereby producing the customized bifunctional
desaturase/cyt b5 encoding gene for expression in the
cytosol of a multicellular plant.

18. The method of claim 17, wherein the features
that could diminish or prevent expression in the plant
cytoplasm include one or more features selected from the
group consisting of: putative intron splice sites, plant
polyadenylation signals, RNA polymerase II termination
sequences, and hairpin consensus sequences.
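
For illustration only, steps (b) and (c) of claim 17 can be sketched in Python as follows. The codon choices are simply those read from the first seventeen codons of SEQ ID NO:3 and are not a complete preferred-codon table, and the single AATAAA motif is an assumed example of a polyadenylation-type signal; a practical screen would use a fuller motif set. The function names are hypothetical.

```python
# Codons observed for these residues at the start of the SEQ ID NO:3 coding region.
# Illustrative subset only; not a complete preferred-codon table.
PREFERRED_CODON = {
    "M": "atg", "P": "cct", "T": "act", "S": "tct", "G": "gga", "I": "atc",
    "E": "gag", "L": "ctt", "D": "gat", "Q": "caa", "F": "ttc", "K": "aag",
}

# Example motif to screen for (an assumption made for this sketch).
MOTIFS = {"AATAAA": "aataaa"}


def back_translate(protein):
    """Back-translate a protein string using one chosen codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def find_motifs(dna):
    """Return (motif name, 1-based position) for every occurrence of each motif."""
    dna = dna.lower()
    hits = []
    for name, motif in MOTIFS.items():
        index = dna.find(motif)
        while index != -1:
            hits.append((name, index + 1))
            index = dna.find(motif, index + 1)
    return hits


if __name__ == "__main__":
    # First 17 residues of SEQ ID NO:2.
    print(back_translate("MPTSGTTIELIDDQFPK"))
    # Nucleotides 1181-1200 of the native gene (SEQ ID NO:1) and of the
    # synthetic gene (SEQ ID NO:3), which encode the same residues.
    print(find_motifs("aagaagatcaataaaaagaa"))
    print(find_motifs("aagaagatcaacaagaagaa"))
```

The back-translated string can be compared with the start of the coding region in SEQ ID NO:3, and the two 20-nucleotide windows show how a motif present in the native sequence is absent from the corresponding stretch of the synthetic sequence.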

19. The method of claim 17, which further
comprises the step of:

(f) testing the customized bifunctional
desaturase/cyt b5 encoding gene for desaturase function in
fatty acid deficient strains of a microorganism prior to
introducing the gene into vectors for expression in plants.

20. The method of claim 19, wherein the
microorganism is Saccharomyces cerevisiae.



21. The method of claim 17, which further
comprises incorporating into the customized gene one or
more genomic segments from plant desaturase or other ER
lipid biosynthetic genes, which comprise beneficial
elements to further optimize expression of the genes in
plants, comprising the steps of:

a) selecting a cDNA sequence that
potentially comprises one or more of the beneficial
elements;

b) creating a yeast vector expressing a
desaturase gene modified to contain one or more of the
beneficial elements;

c) testing the vector in a yeast expression
system;

d) isolating regions from genomic DNA that
are homologous to the beneficial elements from the cDNA;
and

e) operably linking the genomic DNA regions
to the customized bifunctional desaturase/cyt b5 encoding
gene to produce the further customized gene.

Synechococcus PCC7002 (U36390)

Anabaena variabilis (D14581)
Kluyveromyces thermotolerans (D83186)
Saccharomyces cerevisiae (J05676)

Pichia angusta (D83185)
Histoplasma capsulatum (X85962)

Yarrowia lipolytica (D83187)
Cryptococcus curvatus (Y10422)

Tetrahymena pyriformis (D84474)
Tetrahymena thermophila (D83478)

Rat (J02585)

Mesocricetus auratus (L26956)
Mouse (M26270)
Sus scrofa (Z97186)
Homo sapiens (Y13647)

Cyprinus carpio (U31864)

Drosophila melanogaster (U73160)
Amblyomma americanum (U03281)
Caenorhabditis elegans (Z95123)

Saccharomyces cerevisiae

Pichia angusta (D83185)
Histoplasma capsulatum (X85962)

Cryptococcus curvatus (Y10422)
Homo sapiens (M60174)
Monkey (S07959)
Rabbit (M24844)

Rat (M14992)
Bovine (X13617)
Chicken (M32293)
Tobacco (X71441)

Borage (U79011) (N-terminal b5 fusion to a desaturase)
Rice (X75670)
Brassica (M87514)

S. cerevisiae cytochrome b5 (L22494)
S. cerevisiae cytochrome b2 (X03215)

S. cerevisiae N-terminal cytochrome b5 fusion to hydroxylase (Z49260)

<110> Martin, Charles E.
Mitchell, Andrew

<120> Synthetic Fatty Acid Desaturase Gene for 
Expression in Plants 

<130> 97-0081 PCT 

<150> US 60/097,586 
<151> 1998-08-24 

<160> 7 

<170> FastSEQ for Windows Version 3.0 

<210> 1
<211> 1555 
<212> DNA 

<213> Saccharomyces cerevisiae 
<400> 1 

tacaacaaag atgccaactt ctggaactac tattgaattg attgacgacc aatttccaaa 60 

ggatgactct gccagcagtg gcattgtcga cgaagtcgac ttaacggaag ctaatatttt 120

ggctactggt ttgaataaga aagcaccaag aattgtcaac ggttttggtt ctttaatggg 180 

ctccaaggaa atggtttccg tggaattcga caagaaggga aacgaaaaga agtccaattt 240

ggatcgtctg ctagaaaagg acaaccaaga aaaagaagaa gctaaaacta aaattcacat 300

ctccgaacaa ccatggactt tgaataactg gcaccaacat ttgaactggt tgaacatggt 360 

tcttgtttgt ggtatgccaa tgattggttg gtacttcgct ctctctggta aagtaccttt 420 

gcatttaaac gttttccttt tctccgtttt ctactacgct gtcggtggtg tttctattac 480 

tgccggttac catagattat ggtctcacag atcttactcc gctcactggc cattgagatt 540 

attctacgct atcttcggtt gtgcttccgt tgaagggtcc gctaaatggt ggggccactc 600 

tcacagaatt caccatcgtt acactgatac cttgagagat ccttatgacg ctcgtagagg 660 

tctatggtac tcccacatgg gatggatgct tttgaagcca aatccaaaat acaaggctag 720

agctgatatt accgatatga ctgatgattg gaccattaga ttccaacaca gacactacat 780 

cttgttgatg ttattaaccg ctttcgtcat tccaactctt atctgtggtt actttttcaa 840 

cgactatatg ggtggtttga tctatgccgg ttttattcgt gtctttgtca ttcaacaagc 900 

taccttttgc attaactcca tggctcatta catcggtacc caaccattcg atgacagaag 960 

aacccctcgt gacaactgga ttactgccat tgttactttc ggtgaaggtt accataactt 1020

ccaccacgaa ttcccaactg attacagaaa cgctattaag tggtaccaat acgacccaac 1080 

taaggttatc atctatttga cttctttagt tggtctagca tacgacttga agaaattctc 1140 

tcaaaatgct attgaagaag ccttgattca acaagaacaa aagaagatca ataaaaagaa 1200 

ggctaagatt aactggggtc cagttttgac tgatttgcca atgtgggaca aacaaacctt 1260 

cttggctaag tctaaggaaa acaagggttt ggttatcatt tctggtattg ttcacgacgt 1320

atctggttat atctctgaac atccaggtgg tgaaacttta attaaaactg cattaggtaa 1380

ggacgctacc aaggctttca gtggtggtgt ctaccgtcac tcaaatgccg ctcaaaatgt 1440 

cttggctgat atgagagtgg ctgttatcaa ggaaagtaag aactctgcta ttagaatggc 1500 

tagtaagaga ggtgaaatct acgaaactgg taagttcttt taagcatcac attac 1555 

<210> 2
<211> 510
<212> PRT

<213> Saccharomyces cerevisiae
<400> 2

Met Pro Thr Ser Gly Thr Thr Ile Glu Leu Ile Asp Asp Gln Phe Pro
1 5 10 15

Lys Asp Asp Ser Ala Ser Ser Gly Ile Val Asp Glu Val Asp Leu Thr
20 25 30

Glu Ala Asn Ile Leu Ala Thr Gly Leu Asn Lys Lys Ala Pro Arg Ile
35 40 45

Val Asn Gly Phe Gly Ser Leu Met Gly Ser Lys Glu Met Val Ser Val
50 55 60

Glu Phe Asp Lys Lys Gly Asn Glu Lys Lys Ser Asn Leu Asp Arg Leu
65 70 75 80

Leu Glu Lys Asp Asn Gln Glu Lys Glu Glu Ala Lys Thr Lys Ile His
85 90 95

Ile Ser Glu Gln Pro Trp Thr Leu Asn Asn Trp His Gln His Leu Asn
100 105 110

Trp Leu Asn Met Val Leu Val Cys Gly Met Pro Met Ile Gly Trp Tyr
115 120 125

Phe Ala Leu Ser Gly Lys Val Pro Leu His Leu Asn Val Phe Leu Phe
130 135 140

Ser Val Phe Tyr Tyr Ala Val Gly Gly Val Ser Ile Thr Ala Gly Tyr
145 150 155 160

His Arg Leu Trp Ser His Arg Ser Tyr Ser Ala His Trp Pro Leu Arg
165 170 175

Leu Phe Tyr Ala Ile Phe Gly Cys Ala Ser Val Glu Gly Ser Ala Lys
180 185 190

Trp Trp Gly His Ser His Arg Ile His His Arg Tyr Thr Asp Thr Leu
195 200 205

Arg Asp Pro Tyr Asp Ala Arg Arg Gly Leu Trp Tyr Ser His Met Gly
210 215 220

Trp Met Leu Leu Lys Pro Asn Pro Lys Tyr Lys Ala Arg Ala Asp Ile
225 230 235 240

Thr Asp Met Thr Asp Asp Trp Thr Ile Arg Phe Gln His Arg His Tyr
245 250 255

Ile Leu Leu Met Leu Leu Thr Ala Phe Val Ile Pro Thr Leu Ile Cys
260 265 270

Gly Tyr Phe Phe Asn Asp Tyr Met Gly Gly Leu Ile Tyr Ala Gly Phe
275 280 285

Ile Arg Val Phe Val Ile Gln Gln Ala Thr Phe Cys Ile Asn Ser Met
290 295 300

Ala His Tyr Ile Gly Thr Gln Pro Phe Asp Asp Arg Arg Thr Pro Arg
305 310 315 320

Asp Asn Trp Ile Thr Ala Ile Val Thr Phe Gly Glu Gly Tyr His Asn
325 330 335

Phe His His Glu Phe Pro Thr Asp Tyr Arg Asn Ala Ile Lys Trp Tyr
340 345 350

Gln Tyr Asp Pro Thr Lys Val Ile Ile Tyr Leu Thr Ser Leu Val Gly
355 360 365

Leu Ala Tyr Asp Leu Lys Lys Phe Ser Gln Asn Ala Ile Glu Glu Ala
370 375 380

Leu Ile Gln Gln Glu Gln Lys Lys Ile Asn Lys Lys Lys Ala Lys Ile
385 390 395 400

Asn Trp Gly Pro Val Leu Thr Asp Leu Pro Met Trp Asp Lys Gln Thr
405 410 415

Phe Leu Ala Lys Ser Lys Glu Asn Lys Gly Leu Val Ile Ile Ser Gly
420 425 430

Ile Val His Asp Val Ser Gly Tyr Ile Ser Glu His Pro Gly Gly Glu
435 440 445

Thr Leu Ile Lys Thr Ala Leu Gly Lys Asp Ala Thr Lys Ala Phe Ser
450 455 460

Gly Gly Val Tyr Arg His Ser Asn Ala Ala Gln Asn Val Leu Ala Asp
465 470 475 480

Met Arg Val Ala Val Ile Lys Glu Ser Lys Asn Ser Ala Ile Arg Met
485 490 495

Ala Ser Lys Arg Gly Glu Ile Tyr Glu Thr Gly Lys Phe Phe
500 505 510

<210> 3 
<211> 1555 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic yeast delta-9 desaturase gene modified 
for expression in plants 

<400> 3 

ggatccaaca atgcctactt ctggaactac tatcgagctt atcgatgatc aattccctaa 60 

ggatgattct gcttcttctg gaatcgttga tgaggttgat cttactgagg ctaacatcct 120

tgctactgga cttaacaaga aggctcctag aatcgttaac ggattcggat ctcttatggg 180 

atctaaggag atggtttctg ttgagttcga taagaaggga aacgagaaga agtctaacct 240

tgatagactt cttgagaagg ataaccaaga gaaggaggag gctaagacta agatccatat 300 

ctctgagcaa ccttggactc tcaacaactg gcatcaacat ctcaactggc tcaacatggt 360 

gctcgtctgt ggaatgccta tgatcggatg gtacttcgct ctctctggaa aagtgcctct 420

ccatctcaac gttttcctct tctctgtctt ctactacgct gttggaggag tgtctatcac 480 

tgctggatac catagactct ggtctcatag atcttactct gctcattggc ctcttagact 540

cttctacgct atctttggat gtgcttctgt tgagggatct gctaagtggt ggggacattc 600 

tcatagaatc catcatagat acactgatac tcttagagat ccttacgatg ctagaagagg 660 

actttggtac tctcatatgg gatggatgct tcttaagcct aaccctaagt acaaggctag 720 

agctgatatc actgatatga ctgatgattg gactatcaga ttccaacata gacattacat 780 

cttgctcatg ctccttactg ctttcgtgat ccctactctc atctgtggat acttcttcaa 840 

cgattacatg ggaggactca tctacgctgg attcatcaga gtgttcgtca tccaacaagc 900 

tactttctgt atcaactcta tggctcatta catcggaact caacctttcg atgatagaag 960 

aactcctaga gataactgga tcactgctat cgttactttc ggagagggat accataactt 1020

ccatcatgag ttccctactg attatagaaa cgctatcaag tggtaccaat acgatcctac 1080 

taaagtgatc atctacttga cttctctcgt gggacttgct tacgatctca agaagttctc 1140 

tcaaaacgct atcgaggagg ctcttatcca acaagagcaa aagaagatca acaagaagaa 1200 

ggctaagatt aattggggac ctgttcttac tgatcttcct atgtgggata agcaaacttt 1260 

ccttgctaag tctaaggaga acaagggact tgttatcatc tctggaatcg ttcatgatgt 1320

ttctggatac atctctgagc atcctggagg agagacttta attaagactg ctcttggaaa 1380 

ggatgctact aaggctttct ctggaggagt ttacagacat tctaacgctg ctcaaaacgt 1440 

gcttgctgat atgagagttg ctgttatcaa ggagtctaag aactctgcta tcagaatggc 1500 

ttctaagaga ggagagatct acgagactgg aaagttcttc tgatctagag gatcc 1555 

<210> 4 
<211> 383 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 4 

Met Gly Ala Gly Gly Arg Met Pro Val Pro Thr Ser Ser Lys Lys Ser
1 5 10 15

Glu Thr Asp Thr Thr Lys Arg Val Pro Cys Glu Lys Pro Pro Phe Ser
20 25 30

Val Gly Asp Leu Lys Lys Ala Ile Pro Pro His Cys Phe Lys Arg Ser
35 40 45

Ile Pro Arg Ser Phe Ser Tyr Leu Ile Ser Asp Ile Ile Ile Ala Ser
50 55 60

Cys Phe Tyr Tyr Val Ala Thr Asn Tyr Phe Ser Leu Leu Pro Gln Pro
65 70 75 80

<210> 5
<211> 1372 
<212> DNA 

<213> Arabidopsis thaliana 



<400> 5 

agagagagag attctgcgga ggagcttctt cttcgtaggg tgttcatcgt tattaacgtt 60 

atcgccccta cgtcagctcc atctccagaa acatgggtgc aggtggaaga atgccggttc 120

ctacttcttc caagaaatcg gaaaccgaca ccacaaagcg tgtgccgtgc gagaaaccgc 180

ctttctcggt gggagatctg aagaaagcaa tcccgccgca ttgtttcaaa cgctcaatcc 240

ctcgctcttt ctcctacctt atcagtgaca tcattatagc ctcatgcttc tactacgtcg 300 

ccaccaatta cttctctctc ctccctcagc ctctctctta cttggcttgg ccactctatt 360 

gggcctgtca aggctgtgtc ctaactggta tctgggtcat agcccacgaa tgcggtcacc 420

acgcattcag cgactaccaa tggctggatg acacagttgg tcttatcttc cattccttcc 480 

tcctcgtccc ttacttctcc tggaagtata gtcatcgccg tcaccattcc aacactggat 540 

ccctcgaaag agatgaagta tttgtcccaa agcagaaatc agcaatcaag tggtacggga 600

aatacctcaa caaccctctt ggacgcatca tgatgttaac cgtccagttt gtcctcgggt 660 

ggcccttgta cttagccttt aacgtctctg gcagaccgta tgacgggttc gcttgccatt 720

tcttccccaa cgctcccatc tacaatgacc gagaacgcct ccagatatac ctctctgatg 780

cgggtattct agccgtctgt tttggtcttt accgttacgc tgctgcacaa gggatggcct 840
cgatgatctg cctctacgga gtaccgcttc tgatagtgaa tgcgttcctc gtcttgatca 900

cttacttgca gcacactcat ccctcgttgc ctcactacga ttcatcagag tgggactggc 960 

tcaggggagc tttggctacc gtagacagag actacggaat cttgaacaag gtgttccaca 1020

acattacaga cacacacgtg gctcatcacc tgttctcgac aatgccgcat tataacgcaa 1080 

tggaagctac aaaggcgata aagccaattc tgggagacta ttaccagttc gatggaacac 1140

cgtggtatgt agcgatgtat agggaggcaa aggagtgtat ctatgtagaa ccggacaggg 1200

aaggtgacaa gaaaggtgtg tactggtaca acaataagtt atgagcatga tggtgaagaa 1260

attgtcgacc tttctcttgt ctgtttgtct tttgttaaag aagctatgct tcgttttaat 1320

aatcttattg tccattttgt tgtgttatga cattttggct gctcattatg tt 1372
<212> PRT

<213> Arabidopsis thaliana
<400> 7
1 5 10 15